Interactive web services are increasingly replacing traditional static web pages. Producing web services seems to require a tremendous amount of laborious low-level coding due to the primitive nature of CGI programming. We present ideas for an improved runtime system for interactive web services built on top of CGI running on virtually every combination of browser and HTTP/CGI server. The runtime system has been implemented and used extensively in <bigwig>, a tool for producing interactive web services.
An interactive web service consists of a global shared state (typically a database) and a number of distinct sessions that each contain some local private state and a sequential, imperative action. A web client may invoke an individual thread of one of the given session kinds. The execution of this thread may interact with the client and inspect or modify the global state.
One way of providing a runtime system for interactive web services would be to simply use plain CGI scripts. However, being designed for much simpler tasks, the CGI protocol by itself is inadequate for implementing the session concept. It supports neither long sessions involving many user interactions nor any kind of concurrency control. Since CGI is the only widespread standard for running web services, this has become a serious stumbling block in the development of complex modern web services.
We present in this paper a runtime system built on top of the CGI protocol that among other features has support for sessions and concurrency control. First, we motivate the need for a runtime system such as the one presented here. This is done by presenting its advantages over a simple CGI script based solution. Afterwards, a description of the runtime system, its different parts, and its dynamic behavior is given. We round off with a discussion of related work, a conclusion, and directions for future work.
In the appendices, we briefly describe an implementation of the suggested runtime system. Also, we give a short presentation of <bigwig>, which is a tool for producing interactive web services that makes extensive use of the self-contained runtime system package.
The technology of plain CGI scripts lacks several of the properties one would expect from a modern programming environment. In the following we discuss various shortcomings of traditional CGI programming and motivate our solution to these problems, namely the design of an improved runtime system built on top of the standard CGI protocol.
First, we will describe and motivate the concept of an interactive web service.
The HTTP protocol was originally designed for browsing static documents connected with hyperlinks. CGI together with forms allows dynamic creation of documents, that is, the contents of a document are constructed on the server at the time the document is requested. Dynamic documents have many advantages over static documents. For instance, the contents of the documents can be tailor-made and up to date.
A natural extension of the dynamic-document model is the concept of interactive services, which is illustrated in Figure 1.
Here the client does not browse a number of more or less independent statically or dynamically generated pages but is guided through a session controlled by a session thread on the server. This session can involve a number of user interactions. The session is initiated by the client submitting a ``start session'' request. The server then starts a thread controlling the new session. This thread generates a reply page which is sent back to the client. The page typically contains some input fields that are filled in by the client. That information is sent to the server, which then generates the next reply, and so on, until the session terminates.
This session concept allows a large class of services to be defined. However, a number of practical problems need to be solved in order to implement this model on top of the CGI model.
As explained above, a web service session consists of a sequential computation that along the way presents information to the client and waits for replies. However, CGI is a stateless protocol, meaning that execution of a CGI script only lasts until a page is shown to the web client. This fact makes it rather tedious to program larger web services involving many client interactions. The sequential computation has to be split up into the small bits of computation that happen between client interactions. Each of these small bits will then constitute a CGI script or an instance of a CGI call.
Furthermore, to achieve persistency of the local state, one has to store and restore it explicitly between CGI calls, for instance ``hidden'' in the web page sent to the client. For simple services where the full session approach is not needed this stateless-server approach might be preferable, but it is clearly inadequate in general.
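To make the cost of the stateless approach concrete, the following is a minimal sketch of storing the local state in a hidden form field and restoring it on the next call. The function names, the field name `state`, and the JSON/base64 encoding are assumptions chosen for illustration; they are not taken from any particular CGI library or from the runtime system described here.

```python
import base64
import html
import json


def render_page_with_state(body_html: str, state: dict) -> str:
    """Embed the session's local state in a hidden form field (hypothetical helper)."""
    blob = base64.urlsafe_b64encode(json.dumps(state).encode("utf-8")).decode("ascii")
    return (
        '<form method="post">\n'
        f"{body_html}\n"
        f'<input type="hidden" name="state" value="{html.escape(blob)}">\n'
        '<input type="submit">\n'
        "</form>"
    )


def restore_state(form_fields: dict) -> dict:
    """Rebuild the local state from the submitted form fields (hypothetical helper)."""
    blob = form_fields["state"]
    return json.loads(base64.urlsafe_b64decode(blob.encode("ascii")).decode("utf-8"))
```

Every script in such a service has to perform this encoding and decoding by hand, and the state travels over the network in both directions on every interaction.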
Thus, the problem of forced termination of the CGI script at each client interaction is two-fold:
We provide a simple solution which splits CGI scripts into two components, namely connectors and session threads. A connector is a tiny transient CGI script that redirects input to a session thread, receives the response from that thread, and redirects it back to the web client. The session threads are persistent processes running residently on the web server. They survive CGI calls and can therefore implement a long sequential computation involving several client interactions. The use of transient connectors and persistent session threads decreases the difficulty of writing and maintaining web services. Furthermore, it improves substantially on the overhead of the web server during execution of a service.
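The division of labor between connectors and session threads can be sketched as follows. This is only an in-process illustration under stated assumptions: the runtime system uses separate processes on the web server, whereas here a Python thread and two queues stand in for the persistent session and its communication channel. The names `SessionThread`, `show`, and `connector` are hypothetical.

```python
import queue
import threading


class SessionThread(threading.Thread):
    """Persistent session: one sequential computation spanning several interactions."""

    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()   # client input, delivered by connectors
        self.outbox = queue.Queue()  # reply pages, picked up by connectors

    def show(self, page):
        """Hand a page to the current connector, then sleep until the client replies."""
        self.outbox.put(page)
        return self.inbox.get()

    def run(self):
        # The session logic is ordinary sequential code; no state is saved or restored.
        name = self.show("What is your name?")
        age = self.show(f"Hello {name}, how old are you?")
        self.outbox.put(f"Goodbye {name} ({age}).")


def connector(session, form_input):
    """Transient: forward the client's input to the session and relay its reply."""
    if form_input is not None:
        session.inbox.put(form_input)
    return session.outbox.get(timeout=5)
```

A session is then driven by one short-lived connector call per client interaction: `connector(session, None)` for the initial request, followed by `connector(session, "Ada")` and so on, while the local variables `name` and `age` simply stay alive inside the session.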
Traditionally, reply pages from session threads are sent directly to the client. That is, the session thread (or the connector if using the system described above) writes the page to standard-output and the web server sends it on to the client browser. This basic approach imposes some annoying problems on the client:
We suggest a simple solution where--instead of sending the reply itself--the session thread writes its reply to a file visible to the client and then sends to the client a reference to the reply file. By choosing the same URL for the duration of the session, this reference can then function as an identification of that particular session. This solves both the problem with bookmarks and with the ``back'' button. Pressing ``back'' will now bring the client back to the web page where he started the session, which seems like a natural effect.
This method also opens the way to an easy solution to another problem. Sometimes the server requires a long time to compute the information for the next page to be shown to the client. Naturally, the client may become impatient and lose interest in the service or assume that the server or the connection is down if no response is received within a certain amount of time. If confirmation in the form of a temporary response page is sent, the client will know that something is happening and that waiting will not be in vain.
This extra feature is implemented in the runtime system as follows. If a response is not ready within, for instance, 8 seconds, the connector responds with a reference to a temporary page (for instance saying ``please wait'') and terminates. This page will then automatically be loaded by the client's web browser and reload itself, say, every 5 seconds. Once the session thread finishes its computation and the real response page is ready, the thread just replaces the temporary page with the real response page. This will have the effect that the next time the page is reloaded, the real response page will be shown to the client.
This reloading can be done with standard HTML functionality. Of course the reloading causes some extra network traffic, but using this method is probably as close as one gets to server pushing in the world of CGI programming.
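A minimal sketch of this behavior is given below. It assumes that the self-reloading page is expressed with the standard HTML `meta` refresh element and that the connector refers the browser to the reply file with a CGI `Location` header; the text above does not state which mechanisms the runtime system actually uses. The function names and the use of a `threading.Event` as the "reply ready" signal are hypothetical.

```python
import threading


def please_wait_page(refresh_seconds: int = 5) -> str:
    """Temporary page that asks the browser to reload it every few seconds."""
    return (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{refresh_seconds}">'
        "</head><body>Reply not ready, please wait.</body></html>"
    )


def connector_response(reply_ready: threading.Event, reply_url: str,
                       timeout_seconds: float = 8.0) -> str:
    """Wait for the session thread, but never longer than the timeout.

    Either way the client receives a reference to the same reply file. If the
    thread was too slow, that file still holds the temporary page, which keeps
    reloading until the real reply has replaced it.
    """
    reply_ready.wait(timeout_seconds)
    return f"Location: {reply_url}\r\n\r\n"
```

Because the reference is the same in both cases, the connector needs no special handling for slow replies beyond bounding its own waiting time.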
Another serious problem with traditional CGI programming is that concurrency control, such as synchronization of sessions and locking of shared variables, gets handled in an ad-hoc fashion. Typically, this is done using low-level semaphores supplied by the operating system.
As a result, web services often implement these aspects incorrectly, resulting in unstable execution and sometimes even damaging behavior.
Our solution allows one to put safety requirements, such as mutual exclusion or much more complex requirements, separately in a centralized supervising process called the controller. This approach significantly simplifies the job of handling safety requirements. Also, since each of the requirements can be formulated separately, the solution is much more robust towards changes in various parts of the code.
It is generally considered inefficient and unsafe to have centralized components in distributed systems. However, in this case the bottleneck is more likely to be the HTTP/CGI server and the network than the safety controller. In spite of that, we do try to distribute the functionality of our safety controller as discussed in Section 5.
At any time there will be a number of web clients accessing the HTTP/CGI server through the CGI protocol. On the server side we will have a controller and a number of session threads running. The session threads access the global data and produce response pages for the web clients. From time to time a connector will be started as the result of a request from a web client. The connector will make contact with the running session thread. A connector is shut down again after having delegated the answer from a session thread back to the web client.
In the following we give a more detailed description of these components. For an overview of the components in the runtime system, see Figure 2.
Furthermore, the runtime system also contains a global-state database (could be the file-system or a full-fledged database), and a service manager, which takes care of garbage-collecting abandoned session threads and other administrative issues.
In this section we describe the dynamic behavior of the runtime system. We start by explaining the overall structure of the execution of a session thread. Starting from this, we present each of the possible thread transitions.
First, it is described how a session thread is started. Then, transitions involving interaction with a web client, that is, showing web pages and getting replies, are dealt with. Finally, the transitions involving interaction with the controller are presented.
For each transition we give a description of the components involved and their interaction.
The lifetime of a session thread is depicted in the diagram in Figure 3.
When a thread is first started, it enters the state active. Here it can do all sorts of computations.
Eventually it reaches a point where it has composed a response HTML page. This page is shown to the web client and the thread enters the state showing. Here it waits for the web client to respond via yet another HTTP/CGI request. Upon re-submission the thread reenters the state active and resumes execution.
Note that in the world of naive CGI programming, when moving from active to showing and back, one would have to store a complete image of the local state before terminating the script. Then, on the next request, a new process would be started and the local state would have to be reconstructed from the image that was saved. This substantial overhead of saving and restoring local state is avoided completely by the use of transient connectors and resident threads.
While in state active a thread can get to a point in execution where safety-critical computation, such as accessing a shared resource, needs to be carried out. When reaching such a point the thread asks the controller for permission to continue and enters the state waiting. When permission is granted by the controller the thread reenters the active state and continues execution.
With a traditional approach one would have to merge the code implementing the intricate details of concurrency control with the service code. This intermixing would, in addition to substantially reducing the readability of the code, also increase the risk of introducing errors. Our solution separates the code dealing with concurrency control from the service code.
When the session is complete, the thread will leave the state active and end its execution.
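The lifetime described above amounts to a small state machine, which can be written down as a transition table. The sketch below uses the state names from the text (start, active, showing, waiting) plus a final state called `END` here; the `step` helper is a hypothetical illustration, not part of the runtime system.

```python
from enum import Enum


class State(Enum):
    START = "start"
    ACTIVE = "active"
    SHOWING = "showing"
    WAITING = "waiting"
    END = "end"


# Allowed transitions of a session thread, as described in the text.
TRANSITIONS = {
    State.START: {State.ACTIVE},
    State.ACTIVE: {State.SHOWING, State.WAITING, State.END},
    State.SHOWING: {State.ACTIVE},   # the client resubmits
    State.WAITING: {State.ACTIVE},   # the controller grants permission or times out
    State.END: set(),
}


def step(current: State, target: State) -> State:
    """Return the target state if the transition is allowed, otherwise raise."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Note that every transition passes through the active state: a thread never moves directly between showing and waiting.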
This section describes the transition from start to active.
When a new web client makes an HTTP/CGI request, the server will start up a new connector as a CGI script. Since this request is the first one made by the web client, a new thread is started according to the session name given in the request. As will be described later, a response page will be sent back to the client when the thread reaches a show call or a certain amount of time, for instance 8 seconds, has passed.
When a session thread is initiated or when it moves from showing to active, the contents of the reply file are immediately overwritten by a web page containing a ``reply not ready--please wait'' message and a ``refresh'' HTML command. The ``refresh'' command makes the browser reload the page every few seconds until the temporary reply file is overwritten by the real reply as described in the following section. The default contents of the ``please wait'' page can be overridden by the service programmer by simply overwriting the reply file with a message more appropriate for the specific situation.
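Since the browser may reload the reply file at any moment, it is worth making sure that it never observes a half-written page. A minimal sketch of one way to do this is to write the new contents to a temporary file and rename it over the reply file. This write-then-rename technique, the helper name `write_reply`, and the file permissions are assumptions made for this example; the text does not say how the runtime system performs the overwrite.

```python
import os
import tempfile


def write_reply(reply_path: str, page_html: str) -> None:
    """Replace the reply file's contents in one step (hypothetical helper)."""
    directory = os.path.dirname(reply_path) or "."
    # Write the complete page to a temporary file in the same directory ...
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(page_html)
    # mkstemp creates the file owner-only; make it readable by the web server.
    os.chmod(tmp_path, 0o644)
    # ... then rename it over the reply file, so readers see old or new, never a mix.
    os.replace(tmp_path, reply_path)
```

The same helper serves all three cases in the text: installing the default ``please wait'' page, overriding it with a more specific message, and finally installing the real reply.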
During execution of a running thread the service can show a page to the web client and continue execution when receiving response from the client. In the following we describe these two actions.
This section describes the transition from active to showing.
During execution of a session thread one can do computations, inspect the input from the client, produce response documents, etc. When a response document has been constructed and the execution reaches a point where the page is to be shown to the client, the following actions will be taken:
In Figure 2, these actions describe a flow of data starting at the session thread and ending at the client.
This section describes the transition from showing to active.
While the session thread is sleeping in the showing state, the web client will read the page, fill out appropriate form fields, and resubmit. This will result in the following flow of data from the client to the session thread (see Figure 2):
The controller allows the programmer to restrict the execution of a web service in such a way that stated safety requirements are satisfied.
Threads have built-in checkpoints at places where safety-critical code is to be executed. At these checkpoints the thread must ask the controller for permission to continue. The controller, in turn, is constructed in such a way that it restricts execution according to the safety requirements and only allows threads that are not about to violate the requirements to continue.
In the following we describe in further detail the controller itself, what happens when session threads ask for permission, and how permission is granted by the controller.
The controller consists of three parts: some control logic, a number of checkpoint-event queues, and a timeout queue. Figure 4 gives an overview of the controller.
The control logic is the actual component representing the safety requirements. It controls whether events are enabled, and hence when the various session threads may continue execution at checkpoints. One could imagine various approaches, such as finite-state machines or Petri nets. For that reason, the internals of the control logic are not specified here. The only requirement is that the interface must contain the following two functions available to the runtime system:
We explain in the following how these functions are used in the controller.
The checkpoint-event queues form the interface to the running threads of the service. There is a queue for each possible checkpoint event. When a thread reaches a checkpoint it asks the controller for permission to continue by adding its process-ID onto the queues corresponding to the events it wants to wait for at the checkpoint.
As an extra feature one can specify a timeout when asking the controller for permission to continue. For this purpose the controller has a timeout queue. If permission is not granted within the specified time bound, the controller wakes up the thread with the information that permission has not been granted yet, but a timeout event has occurred. The specified timeouts are put in the special timeout queue (which is implemented as a priority queue).
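The two kinds of queues can be sketched as follows, assuming one FIFO queue per checkpoint event and a binary heap as the priority queue for timeouts. The class and method names are hypothetical, and deadlines are plain numbers supplied by the caller rather than wall-clock times.

```python
import heapq
from collections import deque


class ControllerQueues:
    """Checkpoint-event queues plus a timeout queue (illustrative sketch)."""

    def __init__(self, events):
        self.event_queues = {event: deque() for event in events}
        self.timeouts = []  # heap of (deadline, pid); earliest deadline first

    def request(self, pid, events, deadline=None):
        """A thread at a checkpoint enqueues its process ID on every event it waits for."""
        for event in events:
            self.event_queues[event].append(pid)
        if deadline is not None:
            heapq.heappush(self.timeouts, (deadline, pid))

    def withdraw(self, pid):
        """Remove a thread from all queues once it has been woken up."""
        for waiting in self.event_queues.values():
            if pid in waiting:
                waiting.remove(pid)
        self.timeouts = [(d, p) for d, p in self.timeouts if p != pid]
        heapq.heapify(self.timeouts)

    def grant(self, event):
        """Wake the longest-waiting thread for this event, or return None."""
        if not self.event_queues[event]:
            return None
        pid = self.event_queues[event].popleft()
        self.withdraw(pid)
        return pid

    def expire(self, now):
        """Return the threads whose timeouts have passed, earliest deadline first."""
        expired = []
        while self.timeouts and self.timeouts[0][0] <= now:
            _, pid = heapq.heappop(self.timeouts)
            self.withdraw(pid)
            expired.append(pid)
        return expired
```

A thread that waits for several events appears on several queues at once, so waking it for one event (or for a timeout) must remove it from all the others.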
This section describes the transition from active to waiting.
As mentioned earlier, one has the possibility of adding checkpoints to session code where critical code is to be executed. The runtime system interface makes some functions available to the service programmer for specifying checkpoints. Conceptually, the programmer uses them to specify a ``checkpoint statement'' as illustrated with an example in Figure 5.
This example would have the effect that whenever a thread instance of this session reaches this point it will do the following:
When the controller is up and running, it loops doing the following:
If several events become enabled, a token-ring scheduling policy is used. This ensures fairness in the sense that if a thread waits for an enabled event, it will at some point be granted permission to continue.
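One way to read the token-ring policy is that a token circulates among the checkpoint events, and the next event to be served is the first one after the token that is both enabled and has a waiting thread. The sketch below implements that reading; the text names the policy but not its details, so the function, its parameters, and the two predicates `is_enabled` and `has_waiter` are assumptions.

```python
def next_event(ring, token, is_enabled, has_waiter):
    """Pick the next event to serve in round-robin order.

    ring        -- list of all checkpoint events, in a fixed order
    token       -- index of the event that was served most recently
    is_enabled  -- predicate: does the control logic currently allow the event?
    has_waiter  -- predicate: is some thread queued on the event?

    Returns (event, new_token), or (None, token) if nothing can be served.
    """
    n = len(ring)
    for offset in range(1, n + 1):
        index = (token + offset) % n
        event = ring[index]
        if is_enabled(event) and has_waiter(event):
            return event, index
    return None, token
```

Because the search always starts just after the event served last, an event that stays enabled and has a waiting thread is served after at most one full round, which gives the fairness property stated above.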
This section describes the transition from waiting to active.
Having sent a request for permission to continue, the thread is sleeping, waiting for the controller to make a response. If a ``permission granted'' signal is sent to the thread, it wakes up and continues, branching according to the event signaled by the controller. In the example checkpoint in Figure 5, if the controller grants permission for an E1 event, execution is continued at the code following case E1. If the controller sends a ``timeout'' signal, execution continues after timeout.
The runtime system described in the previous sections can be extended in several ways. The following extensions either have been implemented in an experimental version of the runtime system package or will be in the near future. With these extensions, we believe that we begin reaching the limits of what is possible with the standard CGI protocol and the current functionality of standard browsers.
To simplify the presentation, we have so far described the controller as one centralized component. In most cases it is possible to divide the control logic into independent parts controlling disjoint sets of checkpoint events. The controller can then be divided into a number of distributed control processes [10, 11]. This way the problem of the controller being a bottleneck in the system is successfully avoided.
Using the idea of connectors and controllers, one can construct a ``remote service monitor'', that is, a program run by a super-client, which is able to access logs and statistics information generated by the connectors and controllers, and to inspect and change the global state and the state of the control logic in the controllers. This can be implemented by having a dedicated monitor process for each service.
The system presented here is quite vulnerable to hostile attacks. It is easy to hijack a session, since the URL of the reply file is enough to identify a session. A simple solution is to use random keys in the URLs, making it practically impossible to guess a session ID. Of course, all information sent between the client's browser and the server, such as the session ID and all data written in forms, can still be eavesdropped. To avoid this, we have been experimenting with cryptography, making all communication completely secure in practice. This requires the use of browser plug-ins, which unfortunately have not been standardized. The cryptographic algorithms used in the experiments are RSA, DES3, and RIPE-MD160. They prevent hijacking, provide secure channels, and verify user ID--all transparently to the client.
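A minimal sketch of the random-key idea, assuming a cryptographically strong random source is used for the key, is shown below. The helper name `new_reply_url`, the URL layout, and the key length are hypothetical choices for this example.

```python
import secrets


def new_reply_url(base_url: str, nbytes: int = 16) -> str:
    """Build a reply-file URL whose name is an unguessable random key.

    With nbytes=16 the key carries 128 bits of randomness, encoded in a
    URL-safe alphabet.
    """
    key = secrets.token_urlsafe(nbytes)
    return f"{base_url.rstrip('/')}/{key}.html"
```

As noted above, this only protects against guessing a session ID; anyone who can observe the traffic still sees the URL.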
In the session concept illustrated in Figure 1, only one page is generated and shown to the client at a time. However, often the service wants to generate a whole ``cluster'' of linked documents to the client and let the client browse these documents without involving the session thread. With the current implementation, a solution would be to program the possibility of browsing the cluster into the service code--inevitably a tedious and complicated task.
Document clusters can be implemented by simply having a reply file for each document in the cluster. Recall, however, that in the presented setup, the name of the reply file was fixed for the duration of a session. That way, the history buffer of the browser got a reasonable functionality. Therefore, to get that functionality we need a somewhat different approach: the reply files are not retrieved directly by the HTTP server but via a connector process. This connector receives the ID of the session thread in the CGI query string and the document number in a hidden variable.
If all server processes (the session threads, safety controllers, etc.) are running on the same machine, that is, the possibility of distributing the processes is not being exploited, they might as well be combined into a single process using light-weight threads. This decreases the memory use (unless the operating system provides transparent sharing of code memory) and removes the overhead of process communication. The resulting system becomes something very close to being a dedicated web server. The important difference is that it still builds upon the CGI protocol.
The idea of having persistent processes running residently on the server is central in the FastCGI system. One difference is that FastCGI requires platform- and server-dependent support, while our approach works for all servers that support CGI. Also, our runtime system is tailored to support more specific needs.
A more detailed and formal description of how one can make use of safety requirements written separately in a suitable logic can be found in [11, 2]. A language for writing safety requirements is presented, the compilation process into a safety controller is described, and optimizations for memory usage and flow capacity of the controller are developed. A recent paper generalizes these ideas, resulting in a standard scheme for generating controllers for discrete event systems with both controllable and uncontrollable events.
The Mawl language [1, 3, 7] has been suggested as a domain-specific language for describing sequential transaction-oriented web applications. Its high-level notation is also compiled into low-level CGI scripts. Mawl directly provides programming constructs corresponding to global state, dynamic documents, sessions, local state, imperative actions, and client interactions. This system shows great promise to facilitate the efficient production of reliable web services. While Mawl thus offers automatic synthesis of many advanced concepts, it still relies on standard low-level semaphore programming for concurrency control. Also, it does not have a FastCGI-like solution; instead, it is possible to compile a service into a dedicated server for that particular service. Though faster than using simple CGI scripts, this solution is, as opposed to a FastCGI-like solution, not easily ported between different machine architectures.
The implementation as briefly described in Appendix A constitutes the core of the <bigwig> tool, which is currently being developed at BRICS. In the <bigwig> tool, the runtime system we propose here has been shown to provide simple and efficient solutions to problems occurring more and more often due to the increased use of interactive web services. Furthermore, the session concept seems to constitute a framework which is very natural to use for designing complex services. By basing the design of the runtime system on very widely used protocols, the system is easy to incorporate. The further development of the runtime system can be followed on the <bigwig> homepage.
A UNIX version of the runtime system has been implemented (in C) as a package ``runwig'' containing the following components (corresponding to Figure 2):
An experimental version of the runtime package implements the extensions described in Section 5. The runwig package--including all source code, detailed documentation, and examples--is available at http://www.brics.dk/bigwig/runwig/.
The <bigwig> language is really a collection of tiny domain-specific languages focusing on different aspects of interactive web services. To minimize the syntactic burdens, these contributing languages are held together by a C-like skeleton language. Thus, <bigwig> has the look and feel of C-programs with special data- and control-structures.
A <bigwig> service executes a dynamically varying number of threads. To provide a means of controlling the concurrent behavior, a thread may synchronize with a central controller that enforces the global behavior to conform to a regular language accepted by a finite-state automaton. That is, the 'control logic' in <bigwig> consists of finite-state automata. The controlling automaton is not given directly, but is computed (by the MONA [6, 9] system) from a collection of individual concurrency constraints phrased in first-order logic. Extensions with counters and negated alphabet symbols add expressiveness beyond regular languages.
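To illustrate how a finite-state automaton can act as control logic, the sketch below hand-writes a two-state automaton that enforces mutual exclusion on a critical region. In <bigwig> the automaton is computed from logical constraints rather than written by hand, and the class name, the method names `is_enabled` and `fire`, and the event names `enter` and `leave` are all hypothetical.

```python
class AutomatonControl:
    """Control logic backed by a deterministic finite-state automaton (sketch)."""

    def __init__(self, transitions, initial):
        self.transitions = transitions  # {(state, event): next_state}
        self.state = initial

    def is_enabled(self, event):
        """An event is enabled if the automaton has a transition for it right now."""
        return (self.state, event) in self.transitions

    def fire(self, event):
        """Record that the event occurred and advance the automaton."""
        if not self.is_enabled(event):
            raise ValueError(f"event {event!r} is not enabled in state {self.state!r}")
        self.state = self.transitions[(self.state, event)]


# Mutual exclusion: at most one thread may be between "enter" and "leave".
mutex = AutomatonControl(
    transitions={("free", "enter"): "busy", ("busy", "leave"): "free"},
    initial="free",
)
```

After one thread has been granted `enter`, further `enter` events stay disabled until that thread passes its `leave` checkpoint, so other threads remain in the waiting state in the meantime.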
HTML documents are first-class values that may be computed and stored in variables. A document may contain named gaps that are placeholders for either HTML fragments or attributes in tags. Such gaps may at runtime be plugged with concrete values. Since those values may themselves contain further gaps, this is a highly dynamic mechanism for building documents. The documents are represented in a very compressed format, and plug operations take only constant time. A flow-sensitive type checker ensures that documents are used in a consistent manner.
The familiar struct and array data structures are replaced with tuples and relations, which allow for a simple construction of small relational databases. These are efficiently implemented and should be sufficient for databases no bigger than a few MBs (of which there are quite a lot). A relation may be declared to be external, which will automatically handle the connection to some external server. An external relation is accessed with (a subset of) the syntax for internal relations, which is then translated into SQL.
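The gap-and-plug idea can be illustrated with a small sketch in which a document is a sequence of text fragments and named gaps. This is not the <bigwig> representation: the class names are hypothetical, the naive list copying below takes time linear in the document size rather than constant time, and no HTML escaping or type checking is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gap:
    name: str


@dataclass(frozen=True)
class Document:
    parts: tuple  # a mix of str fragments and Gap placeholders

    def plug(self, name, value):
        """Return a new document with every gap `name` filled by a string or document."""
        filler = value.parts if isinstance(value, Document) else (value,)
        new_parts = []
        for part in self.parts:
            if isinstance(part, Gap) and part.name == name:
                new_parts.extend(filler)  # the filler may bring gaps of its own
            else:
                new_parts.append(part)
        return Document(tuple(new_parts))

    def gaps(self):
        """Names of the gaps that are still open."""
        return {part.name for part in self.parts if isinstance(part, Gap)}

    def render(self):
        """Produce the final HTML; all gaps must have been plugged."""
        if self.gaps():
            raise ValueError(f"unplugged gaps: {sorted(self.gaps())}")
        return "".join(self.parts)
```

For example, plugging a fragment that itself contains a gap `who` into the `body` gap of a page leaves `who` open in the result, so the page can be completed in a later step.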
An important mechanism for gluing these components together is a fully general hygienic macro mechanism that allows <bigwig> programmers to extend the language by adding arbitrary new productions to its grammar. All nonterminals are potential arguments and result types for such macros that, unlike C-front macros, are soundly implemented with full alpha-conversions. Also, error messages remain sensible, since they are threaded back through macro expansion. This allows the definition of Very Domain-Specific Languages that contain specialized constructions for building chat rooms, shopping centers, and much more. Macros are also used to wrap concurrency constraints and other primitives in layers of user-friendly syntax.
Version 0.9 of <bigwig> is currently undergoing internal evaluation at BRICS. If you want to try it out, then contact us for more information. The documentation is very rough as yet, but this has a high priority in the next few months. The project is scheduled to deliver a version 1.0 of the <bigwig> tool in June 1999. This will be freely available in an open source distribution for UNIX.
Claus Brabrand received his M.Sc. degree in computer science
from the University of Aarhus, Denmark, in January 1999. He is
currently a research assistant employed at BRICS at the University
of Aarhus on the <bigwig> project. In August 1999,
he will start as a Ph.D. student also at BRICS. Areas of interest
and research include: domain specific languages (DSL), compilers
and tools for rapid DSL construction, syntactic-level macro languages,
programming languages, and Internet services.
Anders Møller is a Ph.D. student at BRICS at the
University of Aarhus, Denmark, and consultant for AT&T Labs
Research. From June through August 1998 he visited the Algorithms
and Specification group at AT&T Labs Research. His main research
interests include programming languages, logic, and verification.
In particular, he works on the BRICS MONA project and the <bigwig> project.
Anders Sandholm received the B.Sc. and M.Sc. degrees in
computer science from the University of Aarhus, Denmark, in 1994
and 1997, respectively, and expects to hand in his Ph.D. thesis
during the summer of 1999.
From June through August 1996 he was a Member of Technical Staff in the Computing Sciences Research Department at Bell Labs, Lucent Technologies; from June through August 1997, a Member of Technical Staff in the Algorithms and Specification group at AT&T Labs Research; and from September through December 1998, a visitor in the Software Production Research department at Bell Labs, Lucent Technologies.
He has worked in the areas of formal verification, semantics, and programming languages. His current research interests are in domain specific languages with emphasis on language design, static program analysis, and applications to control robotics and web programming.
Michael I. Schwartzbach received his Ph.D. (Computer Science)
from Cornell University in 1987. He is an associate professor
at the University of Aarhus and a kernel researcher at the BRICS
Research Center. Michael I. Schwartzbach has studied design and
implementation of programming languages, type systems, static
analysis, and applications of logic.