ACS Request Processor

by Jon Salz


This document describes the request processor, a set of Tcl procedures that handle every HTTP request made to an AOLserver running ACS.

The Big Picture

In the early days of the Web, before the dawn of database-driven, dynamic content, web servers maintained a very straightforward mapping between URLs and files. In response to a request for a particular URL, servers just prepended a document root path like /web/arsdigita/www to the URL, serving up the file named by that path.

This is no longer the case: the process of responding to a request involves many more steps than simply resolving a path and delivering a file. Serving an ACS page involves (at the very least) reading security information from HTTP cookies, extracting subcommunity information from the URL, calling filters and registered procedures, invoking the abstract URL system to determine which file to serve, and then actually serving the file.

The traditional way to realize this process was to register a plethora of filters and procedures using ns_register_filter and ns_register_proc, but this approach had several problems.

ACS contains a request processor that manages filter and procedure registration to fix these problems. ns_register_filter and ns_register_proc are deprecated. They are replaced by ad_register_filter and ad_register_proc, which have similar syntax but support filter priorities.

How It Works

The request processor disables the ns_register_filter and ns_register_proc API calls; using them results in an error (unless they are called directly by request processor code). ad_register_filter and ad_register_proc, their replacements, don't register anything immediately when invoked. Instead, they build up a list of filters and procedures in NSV arrays.

After server initialization, the request processor examines the list of registered filters, sorts them by priority, and registers them with AOLserver using ns_register_filter. It also registers rp_handler as the very first filter to run.
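
The deferred registration described above can be sketched in plain Tcl. This is only an illustration, not the ACS implementation: the names demo_register_filter and demo_sorted_filters are hypothetical, a plain global list stands in for the NSV arrays, and the rule that a lower priority number runs first is an assumption. It assumes Tcl 8.5 or later for lassign.

```tcl
# Hypothetical stand-ins; not the ACS implementation.
# Pending registrations live in a plain list here; ACS uses NSV arrays.
set pending_filters [list]

# Record a filter instead of registering it immediately.
proc demo_register_filter {priority when method pattern procname} {
    global pending_filters
    lappend pending_filters [list $priority $when $method $pattern $procname]
}

# Return the recorded filters ordered by ascending priority number
# (assumption: a lower number means the filter runs earlier).
proc demo_sorted_filters {} {
    global pending_filters
    return [lsort -integer -index 0 $pending_filters]
}

demo_register_filter 2000 preauth  GET /bboard/* bboard_filter
demo_register_filter 1000 preauth  GET /*        security_filter
demo_register_filter 1500 postauth GET /admin/*  admin_filter

foreach f [demo_sorted_filters] {
    lassign $f priority when method pattern procname
    # In a real server, this is the point where ns_register_filter would run.
    puts "$priority $when $method $pattern -> $procname"
}
```

Because nothing is registered until the whole list is known, the order in which packages call the registration procedure does not affect the order in which filters run.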

When the request processor uses ns_register_filter to register those filters in the ad_register_filter list, it doesn't call ns_register_filter with literally the same procedure name as was provided to ad_register_filter. Instead, it registers the rp_invoke_filter wrapper, which does a few things before dispatching to the handler procedure:

  1. Verifies that the package which registered the filter is active for the current subsite. E.g., if the bboard package has registered a filter for /bboard/view-attachment/*, and the URL being requested is /users/jsalz/bboard/view-attachment/3.html, and the bulletin-board package is disabled for the /users/jsalz subsite, then the filter will not be invoked.
  2. Wraps the call to the actual handler procedure in a catch statement, so a reasonable error message is displayed if the filter throws an error or returns an invalid result.

The request processor's treatment of registered procedures is analogous to its handling of filters. It uses rp_invoke_proc to invoke registered procedures.
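
The second step of the wrapper, trapping errors and invalid results, might look roughly like the following. This is a minimal sketch, not the body of rp_invoke_filter: demo_invoke_filter and the three sample filters are hypothetical, the filter signature is simplified to a single argument, and the package check from the first step is omitted.

```tcl
# Hypothetical wrapper; rp_invoke_filter itself does more than this.
proc demo_invoke_filter {procname why} {
    # Run the real handler and trap any Tcl error it raises.
    if {[catch {set result [$procname $why]} errmsg]} {
        return [list error "filter $procname failed: $errmsg"]
    }
    # Accept only the return values AOLserver defines for filters.
    if {[lsearch -exact {filter_ok filter_break filter_return} $result] < 0} {
        return [list error "filter $procname returned invalid result \"$result\""]
    }
    return [list ok $result]
}

# Sample filters (hypothetical) covering the three cases.
proc good_filter {why}     { return filter_ok }
proc broken_filter {why}   { error "database unavailable" }
proc confused_filter {why} { return maybe }

puts [demo_invoke_filter good_filter preauth]
puts [demo_invoke_filter broken_filter preauth]
puts [demo_invoke_filter confused_filter preauth]
```

Registering the wrapper rather than the handler itself keeps error reporting in one place, so individual filters do not each need their own catch.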

Steps in the Pipeline

  1. In the rp_handler filter:
    1. Global initialization. Initialize the ad_conn global variable, which contains information about the connection (see ad_conn below).

    2. Examine the URL for subsite information. If the URL belongs to a subsite (e.g. /users/jsalz/address-book/ belongs to jsalz's subsite), strip the subsite information from the URL and save it in the environment to be later accessed by ad_conn. If no subsite is specified, then the URL is assumed to correspond to the main subsite (the subsite belonging to the_public object).

      After this step, ns_conn url returns the modified URL (without the subsite prefix).

    3. Developer support. Call the hook to the developer support subsystem, if it exists, to save information about the active connection.

    4. Library reloading. If the package manager has been instructed to reload any *-procs.tcl files, source them. Also examine any files registered to be watched (via the package manager); if any have been changed, source them as well.

    5. Host header checking. (Skipped for system URLs.) Check the HTTP Host header to make sure it's what we expect it to be. If the Host header is present but differs from the canonical server name (as reported by ns_info location), issue an HTTP redirect using the correct, canonical server name.

      For instance, if someone accesses the URL http://arsdigita.com/pages/customers, we redirect them to http://www.arsdigita.com/pages/customers since the canonical host name for the server is www.arsdigita.com, not arsdigita.com.

    6. Security handling. (Skipped for system URLs.) Examine the security cookies, ad_browser_id and ad_session_id. If either is invalid or not present at all, issue a cookie and note information about the new browser or session in the database.

    At this point, any other filters and procedures registered with ad_register_filter and ad_register_proc are executed. Their paths are matched against the URL with subsite information stripped, e.g., a filter registered for /bboard/view-attachments is executed for a URL of the form /users/jsalz/bboard/view-attachments/3.html (provided that the bulletin-board package is enabled for that subsite).

    If no filter returns filter_return and no registered procedure matches, then the abstract URL system is invoked to resolve the URL to a file:

    1. If a prefix of the URL has been registered with rp_register_directory_map, map to the associated directory in the filesystem. For example, if we've called
      rp_register_directory_map "apm" "acs-kernel" "apm-docs"
      
      then all requests under the /apm URL stub are mapped to the acs-kernel package directory www/apm-docs, and all requests under /admin/apm are mapped to the acs-kernel package directory admin-www/apm-docs.

    2. If a prefix of the URL corresponds to a package key registered with the package manager, then map to the www or admin-www directory in that package. For example, if there's a package named address-book, then requests under /address-book are mapped to the /packages/address-book/www directory, and requests under /admin/address-book are mapped to /packages/address-book/admin-www.

    3. Otherwise, just prepend the document root (usually something like /web/arsdigita/www) to the URL, just like AOLserver always used to do.

    Now check to see if the path refers to a directory without a trailing slash, e.g. a request to http://www.arsdigita.com/address-book. If this is the case, issue a redirect to the directory with the trailing slash, e.g. http://www.arsdigita.com/address-book/. This is necessary so the browser will properly resolve relative HREFs.

    Next determine which particular file to serve. If our file name is filename, check to see if any files exist which are named filename.*, i.e. we try automatically adding an extension to the file name. If the URL ends in a trailing slash, then no file name is provided so we look for an index.* file instead. Give precedence to particular extensions in the order specified by the ExtensionPrecedence parameter, e.g. in general prefer to serve .tcl files rather than .adp files.

  2. Call the appropriate handler for the file type.

    1. If it's a Tcl (.tcl) file, source it; if it's an ADP (.adp) file, parse it.

    2. If it's a file with some extension registered with rp_register_extension_handler, use that handler to serve the file. For example, if I call
      rp_register_extension_handler jsp jsp_handler
      
      then if the file is a JSP (.jsp) file, the jsp_handler routine will be called and expected to return a page.

    3. If it's some form of static content (like a GIF, JPEG, or HTML file, or anything else besides the file types listed above), just serve the file verbatim, guessing the MIME type from the suffix.
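
The three directory-resolution rules in the pipeline above (registered directory map, package key, document root) can be modeled as follows. This is a simplified sketch under stated assumptions: demo_resolve and the demo_* variables are hypothetical, the package list is hard-coded rather than read from the package manager, and the /packages root and document root are taken from the examples in the text. It assumes Tcl 8.5 or later for lassign.

```tcl
# Hypothetical data: registered directory maps and package keys.
# URL stub -> {package-key directory}
array set demo_dir_map {apm {acs-kernel apm-docs}}
set demo_packages {address-book bboard}
set demo_docroot /web/arsdigita/www

proc demo_resolve {url} {
    global demo_dir_map demo_packages demo_docroot
    # Split "/admin/apm/index" into its path components.
    set parts [split [string trim $url /] /]
    # Requests under /admin use the admin-www tree instead of www.
    set www www
    if {[lindex $parts 0] eq "admin"} {
        set www admin-www
        set parts [lrange $parts 1 end]
    }
    set stub [lindex $parts 0]
    set rest [join [lrange $parts 1 end] /]
    # Rule 1: a registered directory map wins.
    if {[info exists demo_dir_map($stub)]} {
        lassign $demo_dir_map($stub) pkg dir
        return [string trimright "/packages/$pkg/$www/$dir/$rest" /]
    }
    # Rule 2: a package key maps to that package's www or admin-www.
    if {[lsearch -exact $demo_packages $stub] >= 0} {
        return [string trimright "/packages/$stub/$www/$rest" /]
    }
    # Rule 3: fall back to the document root.
    return $demo_docroot$url
}

puts [demo_resolve /apm/index]
puts [demo_resolve /admin/address-book/list]
puts [demo_resolve /pages/customers]
```

The sketch stops at the directory level; the trailing-slash redirect and the choice of a concrete file by extension happen afterwards, as described in the steps above.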

API

As of ACS 3.3, ns_register_filter and ns_register_proc are dead: trying to use them causes an error. Instead of these two procedures, you'll need to use ad_register_filter and ad_register_proc, drop-in replacements which provide the same functionality but are integrated into the request processor.
ad_register_filter when method URLpattern script [ args ... ]
ad_register_proc [ -noinherit f ] method URL procname [ args ... ]
Drop-in replacements for the obsoleted routines ns_register_filter and ns_register_proc. See the AOLserver documentation for syntax.

ad_script_abort
Halts processing of the active request, aborting execution of the current filter, registered procedure, or Tcl script. Similar to return -code return, except that it works in any stack frame.
ad_conn which
Returns information about the current connection (analogous to ns_conn). Allowed values for which are:

These values are set in the request processor. Some values are not available at various points in the page-serving process; for instance, ad_conn file is not available in preauth/postauth filters since path resolution is not performed until after filters are invoked.
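
As an illustration of the registration calls listed in this section, the following sketch registers one filter and one procedure using the argument order shown above. So that it runs outside ACS, it defines stand-in versions of ad_register_filter and ad_register_proc that only record their arguments; the handlers demo_log_filter and demo_hello_page and the URLs are hypothetical, and the stand-in ad_register_proc does not handle the -noinherit option.

```tcl
# Stand-in definitions so the sketch runs outside ACS. Inside ACS the
# real procedures exist and these stubs are skipped.
set demo_registered [list]
if {[info commands ad_register_filter] eq ""} {
    proc ad_register_filter {when method pattern script args} {
        global demo_registered
        lappend demo_registered [list filter $when $method $pattern $script]
    }
}
if {[info commands ad_register_proc] eq ""} {
    proc ad_register_proc {method url procname args} {
        global demo_registered
        lappend demo_registered [list proc $method $url $procname]
    }
}

# Hypothetical handlers; a real filter must return one of
# filter_ok, filter_break, or filter_return.
proc demo_log_filter {args} { return filter_ok }
proc demo_hello_page {args} { return "hello" }

ad_register_filter preauth GET /bboard/* demo_log_filter
ad_register_proc GET /hello demo_hello_page

foreach r $demo_registered { puts $r }
```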

Parameters

[ns/server/yourservername/request-processor]
; log lots of timestamped debugging messages?
DebugP=0
; URL sections exempt from Host header checks and security/session handling.
; (can specify an arbitrary number).
SystemURLSection=SYSTEM
; precedence for file extensions, e.g., "tcl,adp,html" means "serve
; a .tcl file if available, else an .adp file if available, else an
; .html file if available, else the first file available in alphabetical
; order". Comma-separated.
ExtensionPrecedence=tcl,adp,html,jpg,gif
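
To make the effect of ExtensionPrecedence concrete, here is a small sketch of the selection rule described in the comment above. The helper demo_pick_file is hypothetical and is not part of ACS; in a real server the candidate list would presumably come from something like glob on filename.*, which is an assumption here.

```tcl
# Hypothetical helper: choose among candidate files by extension precedence.
proc demo_pick_file {candidates precedence} {
    # precedence is the comma-separated ExtensionPrecedence value.
    foreach ext [split $precedence ,] {
        foreach file $candidates {
            if {[file extension $file] eq ".$ext"} {
                return $file
            }
        }
    }
    # No preferred extension matched: fall back to alphabetical order.
    return [lindex [lsort $candidates] 0]
}

set precedence "tcl,adp,html,jpg,gif"
puts [demo_pick_file {index.html index.adp index.tcl} $precedence]  ;# index.tcl
puts [demo_pick_file {logo.png logo.bmp} $precedence]               ;# logo.bmp
```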

History

In ACS 3.3 we introduced the first unified request processor implementing the series of actions above. It was written purely in Tcl as a single procedure (not a mess of ns_register_filters and ns_register_procs), allowing us a great deal of control over exactly what happens when we deliver a response to an HTTP request. We also introduced new APIs, ad_register_filter and ad_register_proc, analogous to existing AOLserver APIs (ns_register_filter and ns_register_proc) but tightly integrated into the request processor.

However, Tcl is slower than C and many people questioned the wisdom of essentially porting so much of AOLserver's functionality into Tcl. In ACS 4.0 we add a C module to AOLserver, nsrewrite, allowing us to rewrite URLs for subsites (so that ns_register_filter and ns_register_proc can work properly) and implement the request processor using the ns_* primitives instead of handling every request ourselves in Tcl.

ns_register_filter and ns_register_proc are still deprecated - we maintain the ad_register_filter and ad_register_proc abstractions (and continue to require their use) so we can handle filter priorities, provide debugging support, and facilitate the use of templating at any stage of the request process (e.g., even in a filter returning filter_return).


jsalz@mit.edu

Last Modified: request-processor-3.x.html,v 1.1 2001/01/21 01:38:18 bquinn Exp