Download

part of the ArsDigita Community System by Philip Greenspun
implemented by Sarah Ahmed and Usman Mobin

The Big Idea

A publisher of reports or software would like to be able to ask "Who has downloaded any version of Bloatware 2000?" or to say "Spam everyone who downloaded Bloatware 2000 v2.3 and tell them to upgrade to the new exciting 3.0 version". Note that implicit in the foregoing is that users must register before downloading. A publisher might wish to charge for downloading or limit downloading to members of a group.

You'd think that much of this could be accomplished with our standard user-tracking stuff: user_content_map keeps track of which users have grabbed which files; the member value system can store charges for particular files (prices kept in the static_pages table). Some problems with using these modules include the fact that it is tough to annotate or group files in static_pages and the member value system is really more intended for subscribing users who pay once/month

The Medium-sized Idea

For each file to download, we store

Administration Setup

When the administrator decides to add a new downloadable file to the system, first the download directory must be created. In the current version of the module, this takes the form of the "Add New Download" function on the admin pages. Doing this function only creates the directory in which the downloadable files are stored.

Note that the directory into which the /download/ directory is placed (generally /web/server_name/) must be write accessible to nsadmin. This can either be accomplished by making the owner nsadmin or changing the permissions so that nsadmin has write privileges. Either way, if nsadmin can't write to this directory, an error indicating that "Folder could not be created" will result.

Actually placing the files in that directory is a second and seperate step. To upload a file to the server, select the directory name from the admin screens and then choose "Upload New Version" command. This allows the administrator to place a file on the server that is downloadable by users.

As a side note, if the intended use of the Download Module involves documents that can be opened by a program on the receiving machine (e.g. files of type .jgp, , .doc, .pdf, etc.), then the AOLserver's .ini file (not the ACS .ini file) must be modified to make AOLServer aware of these file types. This entry in the .ini file takes the form:

[ns/server/servername/mimetypes]
.doc=application/msword
.ppt=application/powerpoint
.xls=application/excel

User View

To regular users, the downloadable file is visible based on the assigned name assigned when the download directory was created, combined with the version number. This isn't always the most obvious presentation of download filename, and will likely be customized by applications that offer other than software releases.

Downloads with a status of "promote" are offered on the top-level /download/ page. Downloads with a status of "offer_if_asked" are available on a drill-down page of "/download/one" where multiple versions of the same download may be offered. Downloads with a status of "removed" are only shown to the site or group administrator who has the option of changing their status or looking at an old version.

Users can see all files available for download (even before registration). However, upon actual download request of a file, the system checks the user's registration status and the visibility (e.g. "all", "registered_users", or "group_members") of the specific file and permits download accordingly (e.g. non_registered users are only allowed to anonymously download files with visibility="all", he is required to login otherwise).

Implementation

Files and rows are never removed from the database/system. If someone wants to upload a new copy of version 2.3 of Bloatware 2000, it gets a new version_id and the old row is updated with a status of "removed".

We keep a separate table of rules that will allow downloading. The rules can take the following forms:

Under the Hood

The files to download are generally stored outside the Web server root in a configurable location (by default /web/yourservername/download/). We chose not to store the downloads in an Oracle BLOB column. Downloads are presumed to be large and may require several minutes to transmit to a user with a modem. If we kept the files in a BLOB, we'd have to either copy the data first into a /tmp file or hold onto an Oracle connection for the entire download time. For a typical site configured with 8 Oracle connections maximum, this would mean that 8 simultaneous downloaders could wedge the service.

We keep metadata in an Oracle table and we use Oracle-generated keys as filenames so there is no risk of conflict in the event of concurrent updates of files to be downloaded (assuming there is indeed more than one site administrator).

The directory structure bears some mentioning. An objective is to keep this browsable via Emacs and still have some idea of what is going on, i.e., one should not have to also look into Oracle to verify what versions of what downloads are available.

Via clever use of ns_register_proc, analogous to the static file comment attachment system (see /tcl/ad-html), we present URLs to the user for final downloading that look like /download/files/378/bloatware-2000.tar.gz (/download/files/ trips off the ns_register_proc; after that the format is /**version_id**/**psuedo-filename**). This will result in their browser defaulting them with a reasonable filename to "save as".

Deep under the hood

create sequence download_id_sequence;

create table downloads (
	download_id		integer primary key,
	-- if scope=public, this is a download for the whole system
        -- if scope=group, this is a download for/from a subcommunity
        scope           varchar(20) not null,
	-- will be NULL if scope=public 
	group_id	references user_groups on delete cascade,
	-- e.g., "Bloatware 2000"
	download_name	varchar(100) not null,
	-- e.g., "bw2000" (valid UNIX directory name)
	directory_name	varchar(100) not null,
	-- primary description of the item
	description		varchar(4000),
	-- is the description in HTML or plain text (the default)
	html_p			char(1) default 'f' check(html_p in ('t','f')),
	-- when the download was created, who created it, etc.
	creation_date		date default sysdate not null,
	creation_user		not null references users(user_id),
	creation_ip_address	varchar(50) not null,
        -- state should be consistent
	constraint download_scope_check check ((scope='group' and group_id is not null) 
                                               or (scope='public'))
);

create index download_group_idx on downloads ( group_id );

create sequence download_version_id_sequence;

create table download_versions (
	version_id		integer primary key,
	download_id		not null references downloads on delete cascade,
	-- when this can go live before the public
	release_date		date not null,
	-- important: this is the file name that will be served up to
	-- the user, e.g. bw2000-1.2.3.tar.gz.  This is completely up
	-- to the administrator since we can't verify the contents of
	-- the downloadable files.
	pseudo_filename		varchar(100) not null,
	-- might be the same for a series of .tar files, we'll serve
	-- the one with the largest version_id
	version		 	varchar(4000),
	version_description	varchar(4000),
	-- is the description in HTML or plain text (the default)
	version_html_p		char(1) default 'f' check(version_html_p in ('t','f')),
	-- status of this version
	status			varchar(30) check (status in ('promote', 'offer_if_asked', 'removed')),
	creation_date		date default sysdate not null ,
	creation_user		references users on delete set null,
	creation_ip_address	varchar(50) not null
);

create sequence download_rule_id_sequence;

create table download_rules (
	rule_id		integer primary key,
	-- one of the following will be not null
	version_id	references download_versions on delete cascade,
	download_id	references downloads on delete cascade,
	-- who is allowed to view the download files?
	visibility	varchar(30) check (visibility in 
	                   ('all', 'registered_users', 'purchasers', 
	                    'group_members', 'previous_purchasers')),
	-- who is allowed to download the files?
	availability	varchar(30) check (availability in 
			   ('all', 'registered_users', 'purchasers',
   			    'group_members', 'previous_purchasers')),
	-- price to purchase or upgrade, typically NULL
	price		number,
	-- currency code to feed to CyberCash or other credit card system
	currency	char(3) default 'USD' references currency_codes,
	constraint download_version_null_check check 
	                (download_id is not null or version_id is not null)
);

-- PL/SQL proc
-- returns 'authorized' if a user can view a file, 'not authorized' otherwise.
-- if supplied user_id is NULL, this is an unregistered user and we 
-- look for rules accordingly

create or replace function download_viewable_p (v_version_id IN integer, v_user_id IN integer)
     return varchar2
     IS 
	v_visibility download_rules.visibility%TYPE;
	v_group_id downloads.group_id%TYPE;
	v_return_value varchar(30);
     BEGIN
	select visibility into v_visibility
	from   download_rules
	where  version_id = v_version_id;
	
	if v_visibility = 'all' 
	then	
		return 'authorized';
	elsif v_visibility = 'group_members' then	

		select group_id into v_group_id
		from   downloads d, download_versions dv
		where  dv.version_id  = v_version_id
		and    dv.download_id = d.download_id;

		select decode(count(*),0,'not_authorized','authorized') into v_return_value
		from   user_group_map 
                where  user_id  = v_user_id 
		and    group_id = v_group_id;
	
		return v_return_value;		
	else
		select decode(count(*),0,'reg_required','authorized') into v_return_value
		from   users 
  	        where  user_id = v_user_id;

		return v_return_value;
	end if; 

     END download_viewable_p;
/
show errors

-- PL/SQL proc
-- returns 'authorized' if a user can download, 'not authorized' if not 
-- if supplied user_id is NULL, this is an unregistered user and we 
-- look for rules accordingly

create or replace function download_authorized_p (v_version_id IN integer, v_user_id IN integer)
     return varchar2
     IS 
	v_availability download_rules.availability%TYPE;
	v_group_id downloads.group_id%TYPE;
	v_return_value varchar(30);
     BEGIN
	select availability into v_availability
	from   download_rules
	where  version_id = v_version_id;
	
	if v_availability = 'all' 
	then	
		return 'authorized';
	elsif v_availability = 'group_members' then	

		select group_id into v_group_id
		from   downloads d, download_versions dv
		where  dv.version_id  = v_version_id
		and    dv.download_id = d.download_id;

		select decode(count(*),0,'not_authorized','authorized') into v_return_value
		from   user_group_map 
		where  user_id  = v_user_id 
		and    group_id = v_group_id;
	
		return v_return_value;		
	else
		select decode(count(*),0,'reg_required','authorized') into v_return_value
		from   users 
		where  user_id = v_user_id;
		
		return v_return_value;
	end if; 

     END download_authorized_p;
/
show errors

-- history 

create sequence download_log_id_sequence;

create table download_log (
	log_id		integer primary key,
	version_id	not null references download_versions on delete cascade,
	-- keep track of who downloaded what
	user_id		references users on delete set null,
	entry_date	date not null,
	ip_address	varchar(50) not null,
	-- keeps track of why people downloaded this particular item
	download_reasons varchar(4000)
);

Legal Transactions

From the admin pages, the administrator can From the user pages, the user can

Future Improvements

We have left the money stuff unimplemented in this version of the download module, although the data model supports"purchaser"/ "previous purchaser" user visibility. It could be tied to a future version of the ecommerce module or be standalone.
ahmeds@mit.edu