Email handlers are among the first user-level programs written for any new operating system, and are one of the few core tools that almost anyone using a computer will use on a regular basis. Most recently, we have seen a blossoming of Web-based email systems such as Hotmail and Yahoo Mail. Why build yet another mail system?
Some of the desirable traits of a mail system are:
Mail systems with this architecture already exist. Oracle provides software that does what the webmail module does, and probably more. CriticalPath is a company that provides out-sourced email with IMAP and web front ends. These may be better than webmail. CriticalPath certainly has the advantage that it requires no effort on the part of the user other than sending them a check every once in a while. However, when I used CriticalPath, it was unreachable or unuseably slow about half the time (usually due to network problems to their server). I ran out of patience attempting to install Oracle Email Server.
The downside to these other systems is lack of control. It is difficult to modify the look or extend the features of an email server without access to the source code. In the case of CriticalPath, you are stuck with what they provide, and cannot integrate it to provide web-based email as a seamless service of your web site. If you are using the ArsDigita Community System, webmail provides the core of a web-based email system that relies on proven, reliable systems to do all of the hard work and that is simple to extend. If you are not using the ACS, then perhaps studying an implementation of a working system will aid you in building something suitable for your own needs.
-- Domains for which we receive email.
create table wm_domains (
-- short text key
short_name varchar(100) not null primary key,
-- fully qualified domain name
full_domain_name varchar(100) not null
);
The wm_domains table contains the domains for which we expect
to receive mail. The short_name field stores the text
key used to differentiate mail to different domains as discussed
above. Qmail must be configured to handle email for these domains
manually.
-- Maps email accounts to ACS users.
create table wm_email_user_map (
email_user_name varchar(100) not null,
domain references wm_domains,
user_id not null references users,
primary key (email_user_name, domain, user_id)
);
wm_email_user_map assigns email addresses to ACS users.
Why not just use the email column in the
users table? This approach lets each ACS user receive
email for more than one email address and also permits flexibility on
which address is published to other registered users. As a row is
inserted into this table, the appropriate .qmail alias files are
created for that user.
-- Main mail message table. Stores body of the email, along
-- with a parsed text version with markers for attachments for MIME
-- messages.
create sequence wm_msg_id_sequence;
create table wm_messages (
msg_id integer primary key,
body clob,
-- plain text portions of MIME message; empty if
-- entire message is of type text/*.
mime_text clob,
message_id varchar(500), -- RFC822 Message-ID field
unique_id integer -- for both POP3 UIDL and IMAP UID
);
create index wm_messages_by_message_id on wm_messages(message_id);
This is the primary message table. It stores the body of the message,
a parsed plain-text version with markers for attachments if this is a
multipart MIME message, a denormalized Message-ID field for easy
reference by Message ID, and yet another ID field for IMAP
bookkeeping. The message_id field is not unique, since the same
message may have been received multiple times.
-- Stores attachments for MIME messages.
create table wm_attachments (
msg_id not null references wm_messages,
-- File name associated with attachment.
filename varchar(600) not null,
-- MIME type of attachment.
content_type varchar(100),
data blob,
format varchar(10) check (format in ('binary', 'text')), -- for interMedia INSO filter
primary key (msg_id, filename)
);
This table stores MIME attachments and associated information. The
format field enables the use of the interMedia INSO filters to search
encoded files, such as Word or PDF documents.
-- Maps mailboxes (folders, in more common terminology) to ACS users.
create sequence wm_mailbox_id_sequence;
create table wm_mailboxes (
mailbox_id integer primary key,
name varchar(100) not null,
creation_user references users(user_id),
creation_date date,
uid_validity integer, -- Needed for IMAP
unique(creation_user, name)
);
A "mailbox" is what other systems would term "folders."
-- Maps messages to mailboxes (and thus to users).
create table wm_message_mailbox_map (
mailbox_id integer references wm_mailboxes,
msg_id integer references wm_messages,
seen_p char(1) default 'f' check(seen_p in ('t','f')),
answered_p char(1) default 'f' check(answered_p in ('t','f')),
flagged_p char(1) default 'f' check(flagged_p in ('t','f')),
deleted_p char(1) default 'f' check(deleted_p in ('t','f')),
draft_p char(1) default 'f' check(draft_p in ('t','f')),
recent_p char(1) default 't' check(recent_p in ('t','f')),
primary key (msg_id, mailbox_id)
);
Messages "belong" to mailboxes, which "belong" to users. This table
exists to map messages to mailboxes and store subsidiary status
information for the messages. Why aren't these just fields in the
wm_messages table? Originally, there was some notion that
identical messages to multiple recipients could be folded together to
save on storage. This may happen at a later date, but raises reference
counting issues.
-- Parsed recipients for a message; enables search by recipient.
create table wm_recipients (
msg_id integer not null references wm_messages,
header varchar(100) not null, -- to, cc, etc.
email varchar(300) not null,
name varchar(200)
);
create index wm_recipients_by_msg_id on wm_recipients(msg_id);
This table contains parsed recipient address information for each message.
-- Headers for a message.
create table wm_headers (
msg_id integer not null references wm_messages,
-- field name as specified in the email
name varchar(100) not null,
-- lowercase version for case insensitive searches
lower_name varchar(100) not null,
value varchar(4000),
-- various parsed versions of the value
time_value date, -- date/time fields
-- email and name, for singleton address fields like From
email_value varchar(300),
name_value varchar(200),
-- original order of headers
sort_order integer not null
);
create index wm_headers_by_msg_id_name on wm_headers (msg_id, lower_name);
Headers are stored separately from the message to aid in
searching. Where applicable, header values are parsed and stored in
typed fields. The original ordering of the headers is maintained, both
so that we can recreate the header block and because order is
significant for certain fields.
-- Table for recording messages that we failed to parse for whatever reason.
create table wm_parse_errors (
filename varchar(255) primary key not null, -- message queue file
error_message varchar(4000),
first_parse_attempt date default sysdate not null
);
If an error occurs while attempting to parse a message, we store a
record of the error in this log for the administrator to review. Only
the first occurrence of an error is logged for any file, to prevent
hundreds of identical error messages from clogging the log.
-- Used for storing attachments for outgoing messages.
-- Should be cleaned out periodically.
create sequence wm_outgoing_msg_id_sequence;
create table wm_outgoing_messages (
outgoing_msg_id integer not null primary key,
body clob,
composed_message clob,
creation_date date default sysdate not null,
creation_user not null references users
);
create table wm_outgoing_headers (
outgoing_msg_id integer not null references wm_outgoing_messages on delete cascade,
name varchar(100) not null,
value varchar(4000),
sort_order integer not null
);
create unique index wm_outgoing_headers_idx on wm_outgoing_headers (outgoing_msg_id, name);
create sequence wm_outgoing_parts_sequence;
create table wm_outgoing_message_parts (
outgoing_msg_id integer not null references wm_outgoing_messages on delete cascade,
data blob,
filename varchar(600) not null,
content_type varchar(100), -- mime type of data
sort_order integer not null,
primary key (outgoing_msg_id, sort_order)
);
-- Create a job to clean up orphaned outgoing messages every day.
create or replace procedure wm_cleanup_outgoing_msgs as
begin
delete from wm_outgoing_messages
where creation_date < sysdate - 1;
end;
/
declare
job number;
begin
dbms_job.submit(job, 'wm_cleanup_outgoing_msgs;',
interval => 'sysdate + 1');
end;
/
When composing messages for sending, the unsent message and any
attachments are stored in the database. When the message is sent, a
MIME message is composed consisting of the text of the message
followed by any attachments (there is currently no facility to
intersperse attachments with text), and the data in these tables
deleted. If a user begins to compose a message, then neglects to send
it, the unsent message would remain in the database indefinitely. We
could handle this either by presenting a list of unsent messages to
the user and presenting the option of completing or deleting the
message, or by periodically cleaning out any old unsent
messages. Webmail does the latter.
Unsent outgoing attachments could as well be stored in the filesystem, but it is easier to manage them when they are all contained within the database.
-- PL/SQL bindings for Java procedures
create or replace procedure wm_process_queue (queuedir IN VARCHAR)
as language java
name 'com.arsdigita.mail.MessageParser.processQueue(java.lang.String)';
/
create or replace procedure wm_compose_message (outgoing_msg_id IN NUMBER)
as language java
name 'com.arsdigita.mail.MessageComposer.composeMimeMessage(int)';
/
-- Parse the queue every minute. Queue directory is hardcoded.
declare
job number;
begin
dbms_job.submit(job, 'wm_process_queue(''/home/nsadmin/qmail/queue/new'');',
interval => 'sysdate + 1/24/60');
end;
/
These PL/SQL bindings for Java procedures are the heart of the
system. wm_process_queue attempts to parse every file in
the given directory as an email message, deliver it to a webmail user,
and delete the file. It is run every minute by an Oracle job.
The argument to wm_process_queue here is one of the two
hardcoded paths in the webmail module. It must be changed if the
webmail user's maildir is created in a different location than that
shown.
This trigger makes deleting messages easy; deleting from-- Trigger to delete subsidiary rows when a message is deleted. create or replace trigger wm_messages_delete_trigger before delete on wm_messages for each row begin delete from wm_headers where msg_id = :old.msg_id; delete from wm_recipients where msg_id = :old.msg_id; delete from wm_message_mailbox_map where msg_id = :old.msg_id; delete from wm_attachments where msg_id = :old.msg_id; end; /
wm_messages will also delete the appropriate rows from
any subsidiary tables.
-- interMedia index on body of message
create index wm_ctx_index on wm_messages (body)
indextype is ctxsys.context parameters ('memory 250M');
-- INSO filtered interMedia index for attachments.
create index wm_att_ctx_index on wm_attachments (data)
indextype is ctxsys.context parameters ('memory 250M filter ctxsys.inso_filter format column format');
-- Trigger to update format column for INSO index.
create or replace trigger wm_att_format_tr before insert on wm_attachments
for each row
declare
content_type varchar(100);
begin
content_type := lower(:new.content_type);
if content_type like 'text/%' or content_type like 'application/msword%' then
:new.format := 'text';
else
:new.format := 'binary';
end if;
end;
/
-- Resync the interMedia index every hour.
declare
job number;
begin
dbms_job.submit(job, 'ctx_ddl.sync_index(''wm_ctx_index'');',
interval => 'sysdate + 1/24');
dbms_job.submit(job, 'ctx_ddl.sync_index(''wm_att_ctx_index'');',
interval => 'sysdate + 1/24');
end;
/
These indices and triggers enable full-text searches over messages. An
INSO filtered index is also created to allow full-text searches over
any attachments which contain text, including formatted documents.
domains/webmail/
--Domains may be created and deleted.
accounts
--Email accounts may be created or deleted.
messages
--Messages may be viewed, re-filed, or deleted.
composing messages
--New messages may be composed.
--Attachments may be added.
PL/SQL Procedures:
wm_process_queue (queuedir IN VARCHAR) Processes the mail queue directory and inserts messages into the database
(scheduled to run every minute by default)
wm_compose_message (outgoing_msg_id IN NUMBER) Given an outgoing_msg_id, updates the wm_outgoing_messages table
and sets the composed_message column to a complete message (including mail headers) which is ready to send.
proc wm_link_author { address } Adds an author filter.
proc wm_link_subject { subject } Adds a subject filter.
proc wm_header_display {msg_id header_display_style user_id} Procedure for displaying headers.
proc wm_quote_message { author msg_text } Quote text with "> " at the start of each line (for replying to messages).
proc wm_check_permissions { msg_id user_id } Checks if user may view message.
proc wm_get_mime_part { conn context } Gets the data from an attachment.
Jin Choi has written an excellent document on installation and configuration of the Webmail module. This covers configuring qmail, loading the data-model, the java files, and testing and configuring the system.
Future improvements include the ability to block senders, save drafts, save copies of sent messages, a more streamlined user-interface, custom signature files, and an IMAP and POP3 interface.