ACS Package Manager (APM)

by Jon Salz, Michael Yoon, and Lars Pind



The Big Picture

In general terms, a package is a unit of software that serves a single well-defined purpose. That purpose may be to provide a service directly to one or more classes of end-user (e.g., discussion forums and file storage for community members, user profiling tools for the site publisher), or it may be to act as a building block for other packages (e.g., an application programming interface (API) for storing and querying access control rules, or an API for scheduling email alerts). Thus, packages fall into one of two categories: application packages, which serve end-users directly, and library packages, which serve as building blocks for other packages.

The ACS itself is a collection of interdependent library and application packages. Prior to ACS 3.3, all packages were lumped together into one monolithic distribution without explicit boundaries; the only way to ascertain what comprised a given package was to look at the top of the corresponding documentation page, where, by convention, the package developer would specify where to find:

Experience has shown us that this lack of explicit boundaries causes a number of maintainability problems for pre-3.3 installations:

  1. Package interfaces were not guaranteed to be stable in any formal way, so a change in the interface of one package would often break dependent packages (which we would only discover through manual regression testing). In this context, any of the following could constitute an interface change:

    This last point is especially important. In most cases, changing the data model should not affect dependent packages. Rather, the package interface should provide a level of abstraction above the data model (as well as the rest of the package implementation). Then, users of the package can take advantage of implementation improvements that don't affect the interface (e.g., faster performance from intelligent denormalization of the data model), without having to worry that code outside the package will now break.

  2. A typical ACS-backed site used only a few of the modules included in the distribution, yet there was no well-understood way to install only what you needed, or to uninstall what you didn't need after installation. Unwanted code had to be removed manually.

  3. Releasing a new version of the ACS was complicated, owing again to the monolithic nature of the software. Since we released everything in the ACS together, all threads of ACS development had to converge on a single deadline, after which we would undertake a focused QA effort whose scale increased in direct proportion to the expansion of the ACS codebase.

  4. There was no standard way for developers outside of ArsDigita to extend the ACS with their own packages. Along the same lines, ArsDigita programmers working on client projects had no standard way to keep custom development cleanly separated from ACS code. Consequently, upgrading the ACS once installed was an error-prone and time-consuming process.

The ACS is essentially a platform for web-based application software, and any software platform has the potential to develop problems like these. Fortunately, there are many precedents for systematic ways of avoiding them, including:

Borrowing from all of the above, ACS 3.3 introduces its own package management system, the ACS Package Manager (APM), which consists of:

Consistent use of the APM format and tools will go a long way toward solving the maintainability problems listed above. Moreover, APM is the substrate that will soon enable us to establish a central package repository, where both ArsDigita and third-party developers will be able to publish their packages for other ACS users to download and install.

For a simple illustration of the difference between ACS without APM (pre-3.3) and ACS with APM (3.3 and beyond), consider a hypothetical ACS installation that uses only two of the thirty-odd modules available circa ACS 3.2 (say, bboard and ecommerce):

Figure: ACS without APM vs. with APM

APM itself is part of a package, ACS Core, a library package that is the only mandatory component of an ACS installation.

The Components of an APM Package

An APM package consists of:
  1. A set of interfaces
  2. Implementations of those interfaces
  3. Documentation
  4. A package specification

Package Interfaces

There are three types of interface that an APM package can define: an application programming interface (API), a user interface (UI), and a configuration interface. By definition, an application package provides a UI but may or may not provide an API. Conversely, a library package provides an API but may or may not provide a UI. A configuration interface is optional for either type of package.

Package Implementation

Implementation varies by type of interface: (Note that we now consider the database schema to be part of the package implementation, not the package interface. In other words, the only code that should execute queries or DML against a package's schema is the package's own implementation code. There are legacy violations of this rule that will be corrected incrementally.)

Package Documentation

A package must contain one or more of the following types of documentation:

Package Specification: The .info file

The package specification is an XML document that lists: Package specifications are typically not authored manually; rather, APM provides a UI for

Here is a sample excerpt from the specification of the ACS Core package itself:

<?xml version="1.0"?>
<!-- Generated by the ACS Package Manager -->

<package key="acs-core" url="http://software.arsdigita.com/packages/acs-core">
    <version name="3.3.0" url="http://software.arsdigita.com/packages/acs-core-3.3.0.apm">
        <package-name>ACS Core</package-name>
        <owner url="mailto:jsalz@mit.edu">Jon Salz</owner>
        <summary>Routines and data models providing the foundation for ACS-based Web services.</summary>
        <release-date>2000-06-03</release-date>
        <vendor url="http://www.arsdigita.com/">ArsDigita Corporation</vendor>

        <provides url="http://software.arsdigita.com/packages/developer-support/tcl-api" version="0.2d"/>
        <!-- No included packages -->

        <files>
            <file type="tcl_procs" path="00-proc-procs.tcl"/>
            <file type="tcl_procs" path="10-database-procs.tcl"/>
            ...
        </files>
    </version>
</package>

The only attributes of the <package> element itself are key and url. The key attribute is a default short name for the package that appears in the APM site administrator UI; to prevent namespace collisions, the key is not fixed and can be changed within an ACS installation. The url attribute identifies the authoritative distribution point for the package (specifically, a directory from which all versions of the package can be obtained). It also serves as the package's universally unique identifier and therefore cannot be changed.

All other properties of the package are stored as attributes and child elements of the <version> element, since they can vary from version to version. The <version> element also has two attributes: name and url. The name attribute is actually a version number that conforms to the numbering convention defined below. It is called name instead of number, because it can be alphanumeric, not purely numeric. The name attribute also designates the maturity of the package: development, alpha, beta, or release. As with the <package> element, the url attribute identifies the authoritative distribution point for the specified version of the package (specifically, the location of an actual package file that can be downloaded) and serves as the package version's universally unique identifier.
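As a rough illustration of how the <package> and <version> attributes described above fit together, the following Python sketch reads them from a trimmed copy of the sample specification. The function name read_spec and the shape of the returned dictionary are hypothetical; APM itself is not known to expose anything like this.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample specification above; the elided part of the
# file list is simply left out.
SPEC = """<?xml version="1.0"?>
<package key="acs-core" url="http://software.arsdigita.com/packages/acs-core">
    <version name="3.3.0" url="http://software.arsdigita.com/packages/acs-core-3.3.0.apm">
        <package-name>ACS Core</package-name>
        <provides url="http://software.arsdigita.com/packages/developer-support/tcl-api" version="0.2d"/>
        <files>
            <file type="tcl_procs" path="00-proc-procs.tcl"/>
            <file type="tcl_procs" path="10-database-procs.tcl"/>
        </files>
    </version>
</package>"""


def read_spec(xml_text):
    """Collect the package-level and version-level identifiers (hypothetical helper)."""
    package = ET.fromstring(xml_text)
    version = package.find("version")
    return {
        # Attributes of <package>: the changeable short name and the fixed identifier.
        "key": package.get("key"),
        "package_url": package.get("url"),
        # Attributes of <version>: the version "name" and the download location.
        "version_name": version.get("name"),
        "version_url": version.get("url"),
        # Each <provides> element is identified by its url and version together.
        "provides": [(p.get("url"), p.get("version")) for p in version.findall("provides")],
        "files": [f.get("path") for f in version.iter("file")],
    }


info = read_spec(SPEC)
print(info["key"], info["version_name"])
```

Keeping the package url and the version url as separate fields mirrors the text: the first identifies the package across all versions, the second identifies one downloadable version.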

The version element contains:

A <provides> or <requires> element identifies an interface with the combination of its url and version attributes, where url is a universally unique identifier for the interface (API or UI) and version is an identifier that conforms to the same version numbering convention used for packages. The convention for constructing an interface URL is:
http://vendor-host/packages/logical-name/implementation-type
In the above example, the vendor-host is software.arsdigita.com, the logical-name is developer-support, and the implementation-type is tcl-api. Other implementation-type values include plsql-api, sql-views, and java-api. (At this time, the result of visiting an interface URL is undefined; in the future, it will display the documentation for the identified interface.)
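The URL convention above can be sketched in Python as a pair of helpers, one that builds an interface URL from its three parts and one that splits it back apart. Both function names are hypothetical, and the strict three-segment check is an assumption drawn only from the pattern shown here.

```python
from urllib.parse import urlsplit


def interface_url(vendor_host, logical_name, implementation_type):
    """Build an interface URL following the convention in the text."""
    return f"http://{vendor_host}/packages/{logical_name}/{implementation_type}"


def split_interface_url(url):
    """Return (vendor_host, logical_name, implementation_type) for an interface URL."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Assumption: exactly /packages/<logical-name>/<implementation-type>.
    if len(segments) != 3 or segments[0] != "packages":
        raise ValueError(f"not an interface URL: {url!r}")
    return parts.netloc, segments[1], segments[2]


url = interface_url("software.arsdigita.com", "developer-support", "tcl-api")
print(url)
print(split_interface_url(url))
```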

Once an interface is published in a <provides> element, future versions of the package must maintain that interface, i.e., no changes can be made to the interface or its implementation that would cause dependent code to break. The interface can be augmented, in which case the version number should be incremented, i.e., a later version of an interface is always a superset of an earlier version. To communicate the fact that an incompatible change has been made to an interface, the package owner will remove the original <provides> element and add a new, different <provides> element, e.g., hypothetically, we might someday replace developer-support/tcl-api with developer-support/tcl-api-2.

Also, a <provides> element can include a deprecated attribute, meaning that the package owner expects to remove the corresponding interface in the future.

Version Numbering Convention

A version number consists of:
  1. A major version number.
  2. Optionally, up to three minor version numbers.
  3. One of the following:

In addition, the letters d, a, and b may be followed by another integer, indicating a version within the release.

For those who like regular expressions:

version_number := integer ('.' integer){0,3} (('d'|'a'|'b') integer?)?

So the following is a valid progression for version numbers:

0.9d, 0.9d1, 0.9a1, 0.9b1, 0.9b2, 0.9, 1.0, 1.0.1, 1.1b1, 1.1
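For readers who want to experiment with the convention, here is a minimal Python sketch that validates a version number against the grammar above and derives a sort key from it. The ordering rule (development before alpha before beta before release, with a missing trailing integer treated as 0) is an assumption inferred from the example progression, not a stated APM rule; the names is_valid and version_key are hypothetical.

```python
import re

# Direct transcription of the grammar: an integer, up to three more
# dot-separated integers, then an optional d/a/b with an optional integer.
VERSION_RE = re.compile(r"([0-9]+(?:\.[0-9]+){0,3})(?:([dab])([0-9]+)?)?")

# Assumed maturity order: development < alpha < beta < release (no letter).
MATURITY_RANK = {"d": 0, "a": 1, "b": 2, None: 3}


def is_valid(name):
    """True if name matches the version number grammar."""
    return VERSION_RE.fullmatch(name) is not None


def version_key(name):
    """Sort key: (numeric components, maturity rank, number within the release)."""
    match = VERSION_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"invalid version number: {name!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    # Assumption: a missing trailing integer (as in 0.9d) sorts as 0.
    return numbers, MATURITY_RANK[match.group(2)], int(match.group(3) or 0)


PROGRESSION = ["0.9d", "0.9d1", "0.9a1", "0.9b1", "0.9b2",
               "0.9", "1.0", "1.0.1", "1.1b1", "1.1"]
print(sorted(reversed(PROGRESSION), key=version_key) == PROGRESSION)
```

One consequence of comparing the numeric components as tuples is that 1.0 sorts before 1.0.0; whether APM treats those two as equal is not specified in this document.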

Distribution Format: The .apm file

In Maximum RPM, Edward Bailey writes:
"Normally, package management systems take all the various files containing programs, data, documentation, and configuration information, and place them in one specially formatted file -- a package file."
This description fits APM packages, which are distributed as gzip-compressed tarfiles, with the special extension .apm. The full naming convention for APM package files is:
package-key-package-version-name.apm
For instance, the first production release of the ACS Core package is named acs-core-3.3.0.apm.
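The naming convention can be expressed as a small Python sketch. The helper names are hypothetical, and splitting at the last hyphen relies on the assumption that version names never contain a hyphen (which holds for the version grammar in this document), since package keys such as acs-core do.

```python
def apm_filename(package_key, version_name):
    """Compose a package file name from the package key and version name."""
    return f"{package_key}-{version_name}.apm"


def split_apm_filename(filename):
    """Recover (package_key, version_name) from a package file name."""
    if not filename.endswith(".apm"):
        raise ValueError(f"not an .apm file: {filename!r}")
    stem = filename[: -len(".apm")]
    # Split at the LAST hyphen: the key itself may contain hyphens.
    key, separator, version = stem.rpartition("-")
    if not separator or not key or not version:
        raise ValueError(f"cannot split key and version: {filename!r}")
    return key, version


print(apm_filename("acs-core", "3.3.0"))
print(split_apm_filename("acs-core-3.3.0.apm"))
```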

Inside the tarfile, there is one directory at the top level, with the same name as the package key, which, in turn, contains:

Aside from the package specification, all items listed above are optional.
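Because a package file is an ordinary gzip-compressed tarfile, its layout can be inspected with standard tools. The following Python sketch checks the two properties stated above: a single top-level directory named after the package key, and the presence of the package specification. It assumes the specification is stored as package-key.info directly inside that directory, as in the bboard example elsewhere in this document; check_apm_layout is a hypothetical name, not part of APM.

```python
import io
import tarfile


def check_apm_layout(fileobj, package_key):
    """Verify the basic layout of a package file and return its member names."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        names = archive.getnames()
    # Every member must live under one top-level directory named after the key.
    top_level = {name.split("/", 1)[0] for name in names}
    if top_level != {package_key}:
        raise ValueError(f"expected a single top-level directory {package_key!r}, found {sorted(top_level)}")
    # Assumption: the specification file is <package-key>/<package-key>.info.
    spec_path = f"{package_key}/{package_key}.info"
    if spec_path not in names:
        raise ValueError(f"missing package specification {spec_path!r}")
    return names


# Build a tiny in-memory package file to exercise the check.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    payload = b"<package/>"
    member = tarfile.TarInfo("bboard/bboard.info")
    member.size = len(payload)
    archive.addfile(member, io.BytesIO(payload))
buffer.seek(0)
print(check_apm_layout(buffer, "bboard"))
```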

ACS Directory Structure

APM installs packages in the packages subdirectory of the server root directory, at the same level as the legacy www, tcl, and parameters directories (which, by the way, continue to serve the same purposes as they did in versions of ACS prior to 3.2; we may remove some of this backward-compatibility in ACS 4).

Thus, the directory structure of the hypothetical ACS 3.3 installation that is illustrated in the diagram above would look something like this:

server-root/
  |
  +-- packages/
        |
        +-- acs-core/
        |
        +-- bboard/
        |     |
        |     +-- doc/
        |     |     |
        |     |     +-- index.html
        |     |     |
        |     |     +-- ...
        |     |
        |     +-- www/
        |     |     |
        |     |     +-- admin/
        |     |     |     |
        |     |     |     +-- index.adp
        |     |     |     |
        |     |     |     +-- ...
        |     |     |
        |     |     +-- index.adp
        |     |     |
        |     |     +-- ...
        |     |
        |     +-- bboard.info
        |     |
        |     +-- bboard.sql
        |     |
        |     +-- bboard-init.tcl
        |     |
        |     +-- bboard-procs.tcl
        |     |
        |     +-- ...
        |
        +-- ecommerce/
              |
              +-- ...

Another component of the ACS Core package, the Request Processor, is responsible for making the various package user interfaces integrate into one coherent hierarchy of URLs. The basic algorithm used to translate a URL into a filesystem path is simple: "When an HTTP request for /package-key/filename is received, then return the file server-root/packages/package-key/www/filename." (In reality, the job of the Request Processor is not so simple.)
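The basic translation rule quoted above can be sketched in a few lines of Python. This covers only the simple rule, not the real Request Processor, which the text notes does considerably more; the function name url_to_path and the rejection of ".." segments are assumptions added for the sake of a safe example.

```python
from pathlib import PurePosixPath


def url_to_path(server_root, url_path):
    """Map /package-key/filename to server-root/packages/package-key/www/filename."""
    parts = PurePosixPath(url_path).parts  # e.g. ('/', 'bboard', 'admin', 'index.adp')
    if len(parts) < 3 or parts[0] != "/":
        raise ValueError(f"expected /package-key/filename, got {url_path!r}")
    if ".." in parts:
        # Assumption: refuse paths that try to climb out of the package's www directory.
        raise ValueError(f"parent references are not allowed: {url_path!r}")
    package_key, rest = parts[1], parts[2:]
    return PurePosixPath(server_root, "packages", package_key, "www", *rest)


print(url_to_path("/web/server-root", "/bboard/admin/index.adp"))
```

Note how the www segment is inserted after the package key: this is the point at which the filesystem layout stops mirroring the URL hierarchy.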

Changes From ACS 3.2 and Prior Versions

Prior to the introduction of APM in ACS 3.3, the contents of a given package were scattered throughout the site's physical structure: In contrast, APM imposes a vertical organization wherein the filesystem does not map directly to the URL hierarchy. The main advantage of the pre-APM filesystem organization was that, given a URL, you always knew where to look for the corresponding file under the page root. In our judgement, the benefit of having the filesystem explicitly preserve the modularity of installed packages outweighs both the loss of this advantage and the extra complexity that is now built into the Request Processor.

Future Improvements

Under the Hood

At startup, the ACS Core scans all package specifications and synchronizes them with the database. Mismatches (indicating that new packages have been installed) will result in appropriate action (running upgrade scripts or notifying the administrator).
michael@arsdigita.com