by Henry Minsky (hqm@arsdigita.com)
Submitted on: 2000-06-30
"The Robustness Principle" -- Internet RFC 2015 and many others ...
Spurred by the popularity of HTML as a document interchange format and
aided by the MIME standards for multimedia-enriched email, some of
these same organizations would like to take advantage of the design
possibilities of HTML and its hyperlink facility to send more useful
and exciting messages to their members.
At first glance, nothing could seem simpler than sending HTML format
content to users by email. Just put your HTML content into a message
body, set the content type as "text/html," and send it. If you are
feeling virtuous, add a parameter of "charset=iso-8859-1" to the
content type. The miracle of Internet standards does the rest. The
recipient gets a beautiful, glossy web page delivered fresh into her
email inbox in the morning.
Unfortunately this is more difficult than it appears.
It turns out there are no Internet standards for
sending and displaying email that contains HTML code. This is why, for
example, you will rarely see relative URLs in HTML email hyperlinks
that you receive; in the absence of unambiguous standards for embedded
HTML, no one is sure how to make them work. The closest thing is the
MIME standard which describes how to encapsulate and identify the
message content types and encodings.
Most of the popular email reading clients have some capabilities to
display HTML. But the level of HTML they can render varies widely, from
the ability to run Internet Explorer as the rendering engine (Eudora
4.x) to being able to handle nothing more than perhaps a <P>
paragraph break and hyperlinks (AOL).
For example, ArsDigita client Away.com (http://www.away.com) sends a daily
email newsletter to around 600,000 subscribers. About 1,000 new users
register at the site every day, and more than half of them subscribe
to the newsletter on registering. The format of the newsletter is the
user's choice of either plain text or a MIME multipart/alternative
message, with a special version formatted for AOL users as described
below. The publisher made a policy decision to default the email type
to HTML for new subscribers, although that can be changed at the
user's preference.
As a result of this policy decision, almost all subscribed users
get the HTML format newsletter. Based on the number of complaints
received by customer service over a several month period, we tried to
estimate the fraction of users who have problems reading the HTML mail
properly. If we assume that the fraction of users who would actually
bother to send a complaint if they have difficulty is just one in a
hundred, then we estimated that greater than 99.8% of the users were
able to read their HTML newsletters satisfactorily.
While your primary goal should be to form a message that as many users
as possible can view correctly, you will find you have to make
tradeoffs. For example, you may decide it is not worth trying to get
your mail to display correctly on WebTV, if that means restricting the
HTML features you use too severely. If there is a large population
tied to some platform with severe limitations, you may wish to send
out separately formatted email to them. The Away.com site sends out
newsletters to AOL users in a restricted subset of HTML which matches
the AOL email reader capabilities. Note that this kind of selective
formatting is really only feasible when you can easily distinguish the
target users, perhaps by matching their email domain names or by
convincing them to give you an explicit preference.
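As a rough sketch of this kind of selective formatting, the following Java fragment picks a newsletter format from the subscriber's stated preference and email domain. The class name, the format names, and the one-entry domain list are illustrative assumptions, not the code Away.com actually runs.

```java
import java.util.Locale;
import java.util.Set;

final class NewsletterFormatChooser {
    enum Format { PLAIN_TEXT, RESTRICTED_HTML, MULTIPART_ALTERNATIVE }

    // Hypothetical list of domains whose mail reader needs the restricted HTML subset.
    private static final Set<String> RESTRICTED_DOMAINS = Set.of("aol.com");

    // An explicit user preference always wins; otherwise fall back to the domain match.
    static Format choose(String email, boolean prefersPlainText) {
        if (prefersPlainText) {
            return Format.PLAIN_TEXT;
        }
        int at = email.lastIndexOf('@');
        String domain = at < 0 ? "" : email.substring(at + 1).trim().toLowerCase(Locale.ROOT);
        return RESTRICTED_DOMAINS.contains(domain)
                ? Format.RESTRICTED_HTML
                : Format.MULTIPART_ALTERNATIVE;
    }
}
```

Matching on the domain is only a guess about which mail reader the user has, so the explicit preference is checked first.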
In varying proportions you will find the following classes of
email clients, arranged here in a roughly observed order:
The Robustness Principle tells us to be conservative in what we
generate. Restrict yourself to the minimal set of HTML
directives you find necessary. Each fancy feature that you incorporate
(tables, embedded images or the trouble-prone JavaScript) will
inevitably cause some email reader to fail to render the
message properly. You need to weigh how much you want those fancy web
pages versus how many people you are willing to exclude from the
content.
The rest of this article will discuss some specific issues with encoding
of HTML content as legal MIME messages and suggest some procedures to
make this easier and more robust.
RFC-822 and its descendants have defined standard fields in the header
of the letter to allow programs to better process the messages. Since
the header information is only to provide information to the mail
reading program, a message can still be delivered without any header
at all even if it confuses the user by not having "from" and "to"
information.
One of the most important headers is the recipient's email address.
This may seem quite simple, but you really should make sure you are
sending to a valid email address.
In ArsDigita Community System installations with hundreds of thousands
of users, we have seen what seems like every possible ASCII string
entered as an email address. There are mostly correct ones, and then
obviously bogus ones like
Although we don't want to have the email system try to 'fix' email
addresses that are not standards-compliant, we recommend that the
system implementors take some effort to try to pre-screen or verify
users' email addresses when they are entered, so as to have a lower
incidence of ill-formed email addresses in the database. Stripping
whitespace is a simple heuristic to fix many of the user entry errors.
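A pre-screening step along these lines might look like the sketch below. It strips whitespace and then applies a deliberately loose plausibility check; the class name and the regular expression are assumptions made for illustration, and the pattern is nowhere near a full RFC-822 address parser.

```java
import java.util.Optional;
import java.util.regex.Pattern;

final class EmailPreScreen {
    // Loose check: exactly one "@", no whitespace, and at least one dot in the domain.
    private static final Pattern PLAUSIBLE =
            Pattern.compile("^[^@\\s]+@[^@\\s.]+(\\.[^@\\s.]+)+$");

    // Returns the whitespace-stripped address if it looks plausible, otherwise empty.
    static Optional<String> normalize(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String stripped = raw.replaceAll("\\s+", "");
        return PLAUSIBLE.matcher(stripped).matches()
                ? Optional.of(stripped)
                : Optional.empty();
    }
}
```

Because stripping whitespace changes what the user typed, it is probably best to run this at entry time and show the cleaned-up address back to the user for confirmation, rather than silently rewriting addresses already in the database.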
For email messages that go out to a wide audience, put an address
in the From field that is prepared to receive a lot of replies. The Reply-To
field is supported by most email clients these days,
so it is technically safe to include a Reply-To address that is separate from
your From address. However, we don't recommend this, because some email clients
will inevitably not respect the Reply-To field, and you will have people replying
to the From address. Some people may also not hit "reply" at all, but copy the From address
manually when sending a reply.
Assume that the From
and Reply-To fields will be treated interchangeably by the user's mail reader,
and expect replies to one or the other to be equally likely.
The key point to understand is that mail system errors and notifications such as bounce
notices are sent back to the sender address in the envelope return path, not to the From
field of the message.
Processing bounces automatically is vital to being able to maintain a clean mailing list
with hundreds of thousands or millions of users.
The SMTP transport protocol [RFC-821] states, for maximum line length:
We've decided to send an email message with Content-Type of
"text/html." However, there are a couple of choices to be made as to
how we structure and encode this message.
The simplest structure for the message would be of the form:
Note: According to the RFCs:
I have seen examples of this kind of message sent in bulk mailings, such
as the American Express example at http://www.arsdigita.com/asj/mime/mime-examples/amex.txt.
There are two disadvantages to the simple structure and encoding methods used
above:
The MIME format allows you to create multipart messages, which
can contain multiple content parts with different content types.
For example, you can send an email message which contains a copy of
your newsletter in both "text/plain" and "text/html" formats.
Sending both plain-text and HTML versions of the message is a good option,
because it allows for a graceful degradation of the appearance of the
message for users whose email clients do not really support HTML but are
MIME-aware.
To send a multipart message, use the Content-Type "multipart." There
are a number of subtypes that can be used to modify this, but the two
we will consider are "multipart/mixed" or "multipart/alternative."
The multipart/mixed content type is used to assert that the message contains
several parts, all of which should be presented to the user.
The multipart/alternative asserts that the message contains several
representations of the same content, and the user's mail client should attempt to
show them the "best" one it can. In practice that means that a mail reader
with a text/plain and text/html part for a multipart/alternative message will
preferably display the text/html. If it is unable to display the text/html message,
it should gracefully degrade to a type it can render.
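To make the structure concrete, here is a minimal sketch that assembles a multipart/alternative message by hand using only the Java standard library. It assumes both bodies are short-line 7-bit ASCII, that the caller supplies a boundary string that does not occur in either body, and that the class and parameter names are purely illustrative. The plain-text part comes first and the HTML part last, since MIME asks for alternatives to be listed in increasing order of preference.

```java
final class AlternativeMessage {
    private static final String CRLF = "\r\n";

    // Builds the complete message text: headers, a blank line, then the two parts.
    static String build(String from, String to, String subject,
                        String plainBody, String htmlBody, String boundary) {
        StringBuilder m = new StringBuilder();
        m.append("From: ").append(from).append(CRLF)
         .append("To: ").append(to).append(CRLF)
         .append("Subject: ").append(subject).append(CRLF)
         .append("MIME-Version: 1.0").append(CRLF)
         .append("Content-Type: multipart/alternative; boundary=\"")
         .append(boundary).append("\"").append(CRLF)
         .append(CRLF);
        appendPart(m, boundary, "text/plain", plainBody); // fallback version first
        appendPart(m, boundary, "text/html", htmlBody);   // preferred version last
        m.append("--").append(boundary).append("--").append(CRLF); // closing delimiter
        return m.toString();
    }

    private static void appendPart(StringBuilder m, String boundary,
                                   String contentType, String body) {
        m.append("--").append(boundary).append(CRLF)
         .append("Content-Type: ").append(contentType)
         .append("; charset=\"us-ascii\"").append(CRLF)
         .append("Content-Transfer-Encoding: 7bit").append(CRLF)
         .append(CRLF)
         .append(body).append(CRLF);
    }
}
```

In production you would more likely let a library such as JavaMail generate the boundary and the part headers; the sketch is only meant to show what ends up on the wire.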
The following example of a multipart MIME message is taken from [RFC-2049]
The encoding of the contents of a MIME part are specified by the
Content-Transfer-Encoding header [RFC-2045]. The encodings you
can rely on working are "7bit," "8bit," "quoted-printable," or "base64."
The quoted-printable encoding is generally considered the best way to
encode HTML or other content which is primarily legal ASCII text.
Quoted-printable provides some protection against some of the
errors that can be introduced by MTAs along the way, such as deletion
of whitespace or truncation of characters' high bits. It also leaves
"vanilla" ASCII text alone for the most part, so the message is still
mostly readable even when encoded, which is a big help for debugging
mail transport errors.
Given all of the previous warnings about email client capabilities,
you might have some concern that there are mail readers
that cannot properly decode the quoted-printable encoding. However, it is
generally safe to assume that any mail client that can render HTML
correctly can probably decode quoted-printable correctly. In fact, if a mail
reader exists that can render HTML but cannot decode quoted-printable, the affected
users should probably upgrade immediately.
[RFC-2045] has this to say about line lengths and QP encoding:
So the recommendation is to keep QP-encoded lines to no more than 76 columns.
This is very good advice, and you would be well advised to take it.
At one point, before
we started QP encoding the HTML, and the publisher was routinely including
lines of length 1000 or more, some recipients had trouble, usually from
an overly vigilant firewall virus detector. There are apparently a number
of security holes in Windows mail readers, and long line lengths in MIME
messages can be used as an exploit in some of them.
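For readers who want to see what the encoding involves, the sketch below is a small quoted-printable encoder with soft line breaks, written against the Java standard library only. It assumes the input is ISO-8859-1 text, and the class name is hypothetical. It is deliberately conservative: it breaks after 75 data characters so that no encoded line, including the trailing "=", exceeds 76 characters.

```java
import java.nio.charset.StandardCharsets;

final class QuotedPrintable {
    private static final int MAX_DATA_PER_LINE = 75; // leaves room for the "=" soft break

    // Encodes text line by line; hard line breaks are emitted as CRLF.
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        String[] lines = text.split("\r?\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append("\r\n");
            }
            encodeLine(lines[i].getBytes(StandardCharsets.ISO_8859_1), out);
        }
        return out.toString();
    }

    private static void encodeLine(byte[] bytes, StringBuilder out) {
        int col = 0;
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i] & 0xFF;
            boolean lastOnLine = (i == bytes.length - 1);
            boolean printable = b >= 33 && b <= 126 && b != '=';
            boolean safeBlank = (b == ' ' || b == '\t') && !lastOnLine;
            // Everything else, including "=" and trailing blanks, becomes =XX.
            String token = (printable || safeBlank)
                    ? String.valueOf((char) b)
                    : String.format("=%02X", b);
            if (col + token.length() > MAX_DATA_PER_LINE) {
                out.append("=\r\n"); // soft line break, removed again by the decoder
                col = 0;
            }
            out.append(token);
            col += token.length();
        }
    }
}
```

Note how plain ASCII passes through untouched, which is what keeps the encoded message readable when you are debugging transport problems.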
There are numerous finer points about encoding a MIME message
in a standards compliant way. They will not be addressed here further.
Rather, the discussion of structuring a compliant
message will be deferred to the JavaMail section below.
Currently there is no supported standard for embedding inline images
into MIME HTML messages. There is a new proposal [RFC-2557] "MIME
Encapsulation of Aggregate Documents, such as HTML (MHTML)," but
it is not clear that any major email clients support it yet.
Another striking issue with sending images with every message is the excessive
bandwidth that will be used;
the images will usually contain far more data than the text portion.
What does tend to be
supported, though by no means universally, is plain-old IMG tags
using live URL links. That is, you can put an absolute URL in the
body of your HTML message, such as
Note however that this doesn't work if the user is offline! There
are users who have programs that dial up, grab their email, then
disconnect. So they will not see the pretty pictures in their
mail, and may in fact see ugly holes in the formatting and collapsed
layout where the images were supposed to go.
If you feel you must use inline images in your HTML mail,
remember also that every image will have to be retrieved from your
server. One ArsDigita client sent out 750,000 newsletters overnight,
each containing twenty or thirty images. The next morning, their
server was a lot less responsive than usual!
You should always provide a way for people to unsubscribe themselves from
a mailing list. They may have forgotten how they subscribed, or someone
may have maliciously subscribed them. It is best to make sure there are multiple
ways for the user to stop receiving the mail. The From and Reply-To
addresses should support email requests to unsubscribe. The message
content should have explicit instructions on how to unsubscribe as well, along
with an email address, URL, and maybe even a phone number. There is nothing
more frustrating than not being able to stop unwanted email from being sent to you!
The publisher should also provide an easy way for the recipients to
set their email type preference, i.e., plain-text or html mail. If
you want to be conservative, you can default to sending new users
only plain text unless they explicitly specify otherwise. If you default
new users to HTML content, make sure you have obvious instructions for
them to set their preferences to plain text content.
I would also encourage the publisher to add a link to a copy of the newsletter
content on their web site, so people whose email readers are hopelessly
inept can still view the content via a web browser.
You can write all the code to build compliant MIME messages yourself,
or you can try to find code that is already written which helps take
care of the composition.
One option, following Jin Choi's Webmail example at http://www.arsdigita.com/asj/webmail/, is to use the JavaMail library
in the construction of a standards-compliant email message. The following
example Oracle/Java code constructs a multipart mime message containing plain text
and HTML parts. While the initial learning curve is somewhat steep (you need to
figure out how to load and call Java inside Oracle), it is
very nice to be able to offload the complexities of composing a message
onto a standard library. This way, if the MIME standard is enhanced or otherwise
changed, you will not have to rewrite much code.
The code below assumes that there is a database table
The message parts are encoded using quoted-printable encoding, and the
entire message is given a multipart/alternative content type. It is at this
point that you would also add directives to set the content-type
parameters. For example, you might want to specify a charset if you were sending content that was not ASCII or ISO-8859-1
compatible, such as Japanese text.
Consider this header, which encodes the subject field in two different
character sets. The MIME spec provides support for "encoded words"
for specifying character sets and encodings within strings in header
fields:
Perhaps not surprisingly, the greatest number of problems I have seen
have been on Microsoft Outlook and Exchange. This may be due to the fact that
the user base for these programs is larger than for other email clients, or it may be due to the non-robust
nature of Microsoft software, especially in relation to Internet standards.
To illustrate some of the difficulties of debugging email viewing
problems from users, here are some real-life examples of bug reports
you can expect to receive. These examples are the entire bug
report messages, not just excerpts. You can see how much debugging
information the typical user will include in their reports.
The users
often do forward back a copy of the message with their mail, but it is
invariably so chewed up as to be practically unrecognizable. In
practically no cases have I ever gotten back a copy of the original
newsletter message that was viewable in its intended form. The implication
is that most email systems that cannot display the message will also
transform it in a destructive way if the user tries to forward it.
Too bad because I was going to forward this to someone who may go to
Scotland this summer."
If you do not have the ability to transmit the appropriate communication
signals, please delete me from any further e-mails. Otherwise, I look
forward to your improved communications.
Thank you!"
Some mail readers can format simple HTML, but not tables, inline
images, or other fancy features. This is an argument in favor of
using a simplified subset of HTML when composing your messages.
The use of HTML in email messages is not yet a universally
supported standard. Thus, you cannot hope to make something that uses
the latest whiz-bang HTML formatting and is reliably readable on every
mail client. So you have to ask, for a given feature set, what is an
acceptable percentage of messages "unreadable" to customers to aim
for? 1%? 0.1%? It is really a judgement call for the publisher. What
the world needs is a clearinghouse of client capabilities so that programmers can know
what HTML subset is rendered acceptably on what fractions of users'
email clients. Without that knowledge, publishers should be
conservative about what they try to send as HTML.
I have two points that I wish had been covered in the article: First, what are the existing social conventions on the network about HTML in e-mail, and second, what do you gain by doing this? Both are critically important.
Certainly, a large number (perhaps even a significant majority) of users have clients capable of rendering HTML e-mail. But for those of us who, by choice or by no fault of our own, use clients that do not render HTML, publishers who choose to encode their e-mail run the risk of sending us junk -- which may eventually get your mail ignored.
Beyond the readability issue, the question is "how many users are you willing to piss off, and how thoroughly will you piss them off?" I've made it a habit to not return to sites (usually sites that want my money) if they send me HTML e-mail. I'm sufficiently reactionary to this stuff that I refuse to read e-mail sent in HTML format -- reactionary to the point that I actively filter for it. The number of spammers using HTML as an encoding method make this a particularly useful strategy for dealing with that problem; while it's probably killing some legitimate mail, I console myself with the knowledge that if they're rude enough to send HTML in the blind, I'm going to return the favor and not read their mail.
I'm probably at the extreme end of the spectrum, but I'm willing to bet that a lot of technically minded individuals -- people for whom using Outlook or Netscape is like pulling teeth -- feel exactly the same way about HTML in non-Web settings, whether it's e-mail or on Usenet. If you know that your target audience can read it, and, more importantly, is willing to read it as HTML, then by all means, send in HTML (as mediated by what I'm about to say). But if you don't know what your target audience is using, don't send in HTML. It's the same thing as not sending e-mail to people who haven't explicitly requested it -- it's rude if you do, and while there's no technical prohibition against doing it, there are numerous social conventions you'll be breaking. With all the discussion about personalization of Web services, it would be trivial to allow users to select whether they want to receive HTML e-mail or not -- assuming, of course, that the default was set to "no."
Aside from HTML itself, adding images to e-mail is a bad idea too. Consider the poor guy with a 28.8 dialup trying to load 30 in-line 25kb JPGs. He's not going to be too happy with you, particularly if he lives in the UK where people are billed for local calls. It's a thin end of the wedge problem, too: How long will it take before publishers start embedding streaming video in e-mail, and what happens to the poor guy on the 28.8 link then? He'll stop coming to your site, he'll filter your mail, and presto -- you've lost a visitor and, presumably, a customer.
This is a bad idea for the same reasons it's a bad idea to send unsolicited e-mail to your users. You may have their e-mail address (a resource), but they may not want to hear from you. Similarly, they may have bandwidth, but they don't want to let you use it to suck back those 30 in-line JPGs. My point so far? Unless you know what your users want and are willing to give you, stick to something very simple that doesn't consume more resources than it absolutely has to.
The other issue that wasn't addressed in the article was the question of what you gain by sending HTML e-mail. Most mail clients that can render HTML can also pick out URLs and turn them into hyperlinks on their own, so the idea of adding clickable links to e-mail becomes irrelevant. If the sole reason you're going to send in HTML is to include images, I would urge you to think really hard before doing it, and again consider what is gained by adding images to e-mail. Do you really need to include product shots? Do you need fancy formatting, backgrounds, fonts, and other chaff that merely "looks nice" but doesn't add anything to the message? What are you trying to communicate that would require the use of HTML (or images), or what do you want to do in e-mail that can't be better done elsewhere? What's the point behind sending images and fancy fonts? Fundamentally, does it communicate a different message from the one you'd be sending with plain text? In e-mail, more than on the Web, the content is what's important, not the presentation style. Minimalist e-mail can be beautiful, and perhaps more effective than maximalist multimedia presentations.
There's no right answer to this, obviously. It's a decision that each publisher is going to have to make independently from every other. But there are issues beyond the ones raised by Mr. Minsky, and publishers who are going to send their users e-mail need to be aware of them, lest they incur the wrath of readers.
I appreciate the blurb on inline images in HTML e-mail, and your warning of the problems that they bring. As it turns out, as of June 2000, many of the major mail clients (Netscape, MS Outlook) do in fact support RFC 2557; but, many web-based mail readers do not properly handle the "multipart/related" MIME content type or the funny "cid:" URIs in IMG tags that refer to attachments elsewhere in the MIME document (e.g., <img src="cid:part2.MAIL.ID@arsdigita.com">) The result is, if you attach in-line images, many mail readers might give up and revert to showing the plain text version rather than the HTML version, if you have one. To echo the previous commenter, think twice before putting in-line images in your HTML e-mail. It may really cause you and your users more trouble than it's worth.
My personal feelings are similar to his, in that I believe that email should of course never be sent unsolicited, and if the user does sign up, they should explicitly be able to choose the format, with a default of plain text. And I tried to be explicit about providing as many ways as possible for people to figure out how to unsubscribe themselves. The publishers have different goals, and are often insensitive to the protocols and unwritten courtesies of the Internet communities. I agree that oftentimes HTML mail is just used to get "in your face" rather than to provide additional function or ease of use.
To Bill Schneider and others, thanks for your comments about specific capabilities of mail readers. It is exactly this kind of information I hoped to exchange with other developers. I feel like I have been shooting in the dark somewhat when sending out millions of emails on behalf of publishers, and not knowing how many people were going to be unable to correctly view the messages, or what the common problems would be. I still see a fraction of unresolved problems, mostly with Outlook on Windows not being able to select the hyperlinks. Never reproducible on any of our systems of course.
Adhering to the email and related standards is very important in this area, to prevent total Microsoft-like chaos, but because there are so many options and implementations, and constant pressure to make use of new MIME features, it is very hard to figure out what features to try to use and expect to work. I hope people will share their experiences, and make this a living document.
Nice article! I've just struggled a lot to code the HTML email newsletter for my site's news (in Portuguese).
Here are some comments/tips:
Some examples:
One more nice trick I've seen done is to place an HTML comment early on in the HTML mail saying something like "<!-- Whoa! What's all this garbage in my email? If you can read this, then your email program probably doesn't support HTML-formatted messages. Here's how to make it better: ... -->" (instructions for getting plain-text mail follows). I'm also still leery of using HTML in email, especially given possible abuses with IMG tags for tracking readership, and rogue javascript.
* width of lines that will be rendered
* fixed width font issues (is there some way to construct tabular data without fixed width fonts that's readable)
* is there any convention that might hint to a mail reader that the content should be fixed width?
* use of tab characters
* mail client issues
* rendering of hyperlinks (OE, for example, seems to NOT render them as clickable if you view the message in fixed width)
* splitting of hyperlinks if they're too wide. How wide should my hyperlink be?
* other mangling that might be done somewhere along the way
* canned hints that might be included for the reader as to how best view the message
etc. Does anyone have such a list or a pointer to one? Thanks,
Bob Sidebotham
Last updated: 2000-08-05
Implementor's note: It cannot be stressed enough that applications
using this standard should follow MIME's suggestion that you "be
conservative in what you generate, and liberal in what you accept."
In this particular case it means it would be wise for an
implementation to accept messages with any content-transfer-
encoding, but restrict generation to the 7-bit format required by
this memo. This will allow future compatibility in the event the
Internet SMTP framework becomes 8-bit friendly.
Many organizations would like to use email to provide services
and keep their members informed on a periodic basis.
The Nitty Gritty World of HTML Email
Who Is Reading Your Email, And With What?
Standards for Email Message Encoding, or What Was That RFC Again?
An SMTP email message is composed of three parts
Creating an email message requires some headers and some content. Let's
look at the headers first.
The To: Field
*#$'12828 xxasdfM, and then
there are the ones which look like they could work if you massaged
them a little like "Mary Smith @ cnn .com."
The From: and Reply-To: Fields
[RFC-822] has this to say about the From: field
4.4.1. FROM / RESENT-FROM
This field contains the identity of the person(s) who wished
this message to be sent. The message-creation process should
default this field to be a single, authenticated machine
address, indicating the AGENT (person, system or process)
entering the message. If this is not done, the "Sender" field
MUST be present. If the "From" field IS defaulted this way,
the "Sender" field is optional and is redundant with the
"From" field. In all cases, addresses in the "From" field
must be machine-usable (addr-specs) and may not contain named
lists (groups).
Special Note on Envelope Return Paths
Many people are confused about the differences between the From field
and the envelope return-path. The issue arises when you want to know
what happens when an email message bounces.
The built-in AOLserver mail routine ns_sendmail
treats the message From field and the envelope sender as the same
address, but that is not what you want for real automated
mail-handling production systems. The ACS bulkmail package, for
example, creates a special unique sender address for each outgoing
message which contains encoded information about to whom the mail was
sent and from which module and mailing run. That way the system can
automatically and unambiguously parse returned mail and match it with
the user email address it was sent to. It can then do useful things
like updating the user's email_bouncing_p flag in the
database.
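The idea of a unique, self-describing sender address can be sketched as follows. This is not the ACS bulkmail implementation; the "bounce-" prefix, the bounces.example.com domain, and the class name are all hypothetical. The recipient's "@" is replaced with "=" so that the whole address fits into the local part of the envelope sender, and a bounce handler can reverse the mapping.

```java
import java.util.Optional;

final class BounceAddress {
    private static final String PREFIX = "bounce-";                   // hypothetical prefix
    private static final String BOUNCE_DOMAIN = "bounces.example.com"; // hypothetical domain

    // Builds an envelope sender that records the mailing run and the recipient.
    static String encode(long mailingRunId, String recipient) {
        int at = recipient.lastIndexOf('@');
        if (mailingRunId < 0 || at <= 0 || at == recipient.length() - 1) {
            throw new IllegalArgumentException("cannot encode: " + recipient);
        }
        return PREFIX + mailingRunId + "-" + recipient.substring(0, at)
                + "=" + recipient.substring(at + 1) + "@" + BOUNCE_DOMAIN;
    }

    // Recovers the original recipient from the address a bounce was delivered to.
    static Optional<String> decodeRecipient(String bounceAddress) {
        int at = bounceAddress.lastIndexOf('@');
        if (at < 0 || !bounceAddress.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String local = bounceAddress.substring(PREFIX.length(), at);
        int dash = local.indexOf('-');    // end of the numeric run id
        int eq = local.lastIndexOf('=');  // stands in for the recipient's "@"
        if (dash < 0 || eq <= dash + 1 || eq == local.length() - 1) {
            return Optional.empty();
        }
        return Optional.of(local.substring(dash + 1, eq) + "@" + local.substring(eq + 1));
    }
}
```

For example, a message to Mary_Smith@foo.com in mailing run 3677 would carry the envelope sender bounce-3677-Mary_Smith=foo.com@bounces.example.com. A real system would also need to cope with very long addresses, which this sketch ignores.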
SMTP Compliance
There isn't much you have to worry about in terms of SMTP compliance; that
should all be taken care of by your email sending routine. However it
is worth noting the following.
So they are saying keep your lines under 1000 characters in
length. However they also say that implementors of MTAs who make this
as a built-in limit are being stupid. In practice, you can probably
send arbitrarily long lines to most email systems. However some may
give errors if you do. Some modern firewall-based virus detectors can
be triggered by overly long lines. If you must have very long lines,
use quoted-printable encoding, or some other form of content encoding.
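A quick check for over-long lines before sending could look like the sketch below. It assumes an ASCII body, so that characters and octets coincide, and the class and method names are illustrative. The threshold of 998 is the 1000-character limit minus the trailing CRLF.

```java
final class LineLengthCheck {
    // 1000 characters including the CRLF leaves 998 for the text itself.
    static final int SMTP_MAX_LINE = 998;

    // Length of the longest line in the body, not counting line terminators.
    static int longestLine(String body) {
        int longest = 0;
        for (String line : body.split("\r?\n", -1)) {
            longest = Math.max(longest, line.length());
        }
        return longest;
    }

    // True if some line is too long to send without a content transfer encoding.
    static boolean needsEncoding(String body) {
        return longestLine(body) > SMTP_MAX_LINE;
    }
}
```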
text line
The maximum total length of a text line including the
<CRLF> is 1000 characters (but not counting the leading
dot duplicated for transparency).
****************************************************
* *
* TO THE MAXIMUM EXTENT POSSIBLE, IMPLEMENTATION *
* TECHNIQUES WHICH IMPOSE NO LIMITS ON THE LENGTH *
* OF THESE OBJECTS SHOULD BE USED. *
* *
****************************************************
MIME Headers and Encoding
In order to send a MIME message, the standards say you must use at least the following
headers:
Building a Simple MIME Message
To: Mary_Smith@foo.com
Subject: Great Deals on English Muffins
From: info@bar.com
MIME-Version: 1.0
Content-Type: text/html; charset="us-ascii"

<h1>Great Deals</h1>
There are some <i>great deals</i> on English Muffins
today.
Content-type: text/plain; charset=us-ascii (comment)
and
Content-type: text/plain; charset="us-ascii"
are completely equivalent.
Multipart MIME Messages
Assume we are sending a multipart/alternative message. We still get
a choice of how to encode the content in each part, and which order to
put the parts in the mail message.
MIME-Version: 1.0
From: Nathaniel Borenstein
(5) (Soft Line Breaks) The Quoted-Printable encoding
REQUIRES that encoded lines be no more than 76
characters long. If longer lines are to be encoded
with the Quoted-Printable encoding, "soft" line breaks
must be used. An equal sign as the last character on a
encoded line indicates such a non-significant ("soft")
line break in the encoded text.
Embedded Images in HTML
One of the first things that publishers seem to want to do is put
images into the HTML they send in their email. This opens up a host
of issues and problems.
<IMG src="http://www.techrepublic.com/images/trlogo94_60.gif">
and many email readers will fetch the image and render it inline when
the user views the message.
Other Considerations
While not part of the encoding process, you should consider some other issues
with the content of your newsletters.
Letting Someone Else Do The Hard Work
spam_history
with a row containing the plain text and HTML versions of the message to be
sent. The code uses the JavaMail API to construct a MIME message and then
inserts the complete message back into the database. Since
this message is designed to go out to millions of users, it is actually
constructed as a template, with the To: and Reply-To: headers containing
placeholder values. This message template can then be passed to a
bulk mailer module that will efficiently send it to a large mailing-list.
Sending Email Directly From Java
Note: you could actually send this mail directly from Java, using the JavaMail
Transport API. Example code to do this is at http://www.arsdigita.com/asj/mime/java-send. It
is not clear that this is something you want to do for high volume mailings
using the default JavaMail transport code, however.
// SpamMessageComposer.sqlj
// originally part of the webmail ACS module
// written by Jin Choi
The Oracle PL/SQL wrapper for this looks like
create or replace procedure spam_test_message (spam_id IN NUMBER)
as language java
name 'com.arsdigita.mail.SpamMessageComposer.composeHTMLMimeMessage(int)';
/
call spam_test_message(3677)
/
International Character Set Encodings
Given that the Internet is a global community, you may want to send
email in other character set encodings than US-ASCII or ISO-8859-1.
For message body content, you should generally only need to add a charset parameter to the
Content-Type header. However encoding of non-US charset info in the headers
can be somewhat more involved.
The example headers show strings encoded in US-ASCII, ISO-8859-1, and ISO-8859-2,
using the Q (similar to quoted-printable) and B (base64) encodings.
For more information see [I18N-MAIL], i18n and Multilingual support in Internet mail, at http://www.terena.nl/multiling/ml-mua/mldoc-review.html.
From: =?US-ASCII?Q?Keith_Moore?=
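To make the header format more concrete, here is a minimal Java sketch of the two encoded-word forms, using only the JDK. The class and method names (EncodedWordSketch, encodeQ, encodeB) are hypothetical. The sketch assumes an ASCII-compatible character set and does not split long values, although the MIME header rules limit each encoded word to 75 characters, so treat it as an illustration rather than a complete encoder.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch of MIME encoded-words: =?charset?encoding?text?=
final class EncodedWordSketch {
    private static final String HEX = "0123456789ABCDEF";

    // "B" encoding: the bytes of the text in the given charset, base64-encoded.
    static String encodeB(String text, Charset cs) {
        String b64 = Base64.getEncoder().encodeToString(text.getBytes(cs));
        return "=?" + cs.name() + "?B?" + b64 + "?=";
    }

    // "Q" encoding: like quoted-printable, but a space becomes "_".
    // To stay on the safe side, everything except ASCII letters and
    // digits is written as =XX (two upper-case hex digits).
    static String encodeQ(String text, Charset cs) {
        StringBuilder sb = new StringBuilder("=?" + cs.name() + "?Q?");
        for (byte b : text.getBytes(cs)) {
            int v = b & 0xFF;
            boolean alnum = (v >= 'a' && v <= 'z')
                    || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9');
            if (v == ' ') {
                sb.append('_');
            } else if (alnum) {
                sb.append((char) v);
            } else {
                sb.append('=').append(HEX.charAt(v >> 4)).append(HEX.charAt(v & 0xF));
            }
        }
        return sb.append("?=").toString();
    }

    public static void main(String[] args) {
        System.out.println("From: "
                + encodeQ("Keith Moore", StandardCharsets.US_ASCII));
        System.out.println("To: "
                + encodeQ("Andr\u00e9", StandardCharsets.ISO_8859_1));
        System.out.println("Subject: "
                + encodeB("Andr\u00e9", StandardCharsets.ISO_8859_1));
    }
}
```

Q is usually the better choice when most of the text is ASCII, because the result stays readable in clients that do not decode it; B is more compact when most bytes would otherwise need escaping.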
Analyzing What Went Wrong
When a user reports that the newsletter "is broken", it is often
remarkably difficult to figure out what is going on. Email readers
can do so much silent damage to a message when trying to display it
that it is often impossible to figure out what they are finally seeing
in their mail reader window. Many users have no idea how their email
works, and thus cannot describe to you a reasonable model of what may
be happening. They simply see something incomprehensible on their
screen. Other times the reports are somewhat succinct, and indicate
that the mail client refuses to launch a browser when a hyperlink is
clicked, indicating that at least the links are displaying, although
they may be corrupted in some way. At least with some of the webmail
services, it is easy to verify if they can correctly handle a MIME
HTML enclosure, whereas if someone is using Lotus Notes on a Windows
3.1 machine it is pretty hopeless trying to help them. The best thing
is to tell them to switch to the plain text version of the newsletter
(which you are providing, right?).
"I can never read your e-mails - is there some way to make them so I can
read them?"
Often it
is next to impossible to figure out where the difficulty might be
arising. When trying to debug the situation, one approach is to ask the
user, "Are there any other HTML newsletters which you receive correctly?"
to which the answer is often "no". In this case the problem is probably
their mail reader's inability to format HTML, rather than our MIME
encoding of the messages. Sometimes they say yes, but it turns out they
are receiving mail with a subset of HTML which has no images or no tables.
"Why is the writing in the e-mail so small? Please enlarge the
articles printing."
"To whom it may concern, Unfortunately I'm unable to open your sites, that
you send me daily
Any assistance would be appreciated"
"Your email is coming out as HTML code."
"For some reason, I'm not receiving this properly (see below)......."
"
Hi,
I tried to download the image & your links
don't work (any of them).
"
"Is this the way this is supposed to look?"
"Please advise....I've recieved your Daily Escape for months and months
through my email address at xxx@yyy.net and always recieved
beautiful and interesting photographs. HOWEVER, since I've switched to
ComuServe I re-registered with you for the Daily Escapes to be sent to my
new
email address at: xxx@xx.com and am not getting photos with the
Daiily Escape. Did I sign up incorrectly or ask for the wrong subscription?
Help please Thanks"
"I receive you e-mails with instructions to click on the underlined blue
highlighted words; however when I do, nothing happens. Are you aware of
this fact?"
"Hi. I have not been able to click on anything in the past few messages I
received from you. Certain things are underlined in blue or say click here
for more details but I can't. Is there anything you can do to help?
Thanks"
Examples
You can find some real-world examples of HTML format mail that I have
received at http://www.arsdigita.com/asj/mime/mime-examples/.
Note the wide spectrum of MIME encoding features used (e.g., QP vs 7bit,
multipart vs single part). It is hard to say which of these formats is
the most likely to be readable on the maximum number of mail clients.
Final Notes
References
[US-ASCII] Coded Character Set--7-Bit American Standard Code for
Information Interchange, ANSI X3.4-1986.
[ISO-2022] International Standard--Information Processing--ISO 7-bit
and 8-bit coded character sets--Code extension techniques, ISO
2022:1986.
[ISO-8859] Information Processing -- 8-bit Single-Byte Coded Graphic
Character Sets -- Part 1: Latin Alphabet No. 1, ISO 8859-1:1987. Part
2: Latin alphabet No. 2, ISO 8859-2, 1987. Part 3: Latin alphabet
No. 3, ISO 8859-3, 1988. Part 4: Latin alphabet No. 4, ISO 8859-4,
1988. Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988. Part 6:
Latin/Arabic alphabet, ISO 8859-6, 1987. Part 7: Latin/Greek
alphabet, ISO 8859-7, 1987. Part 8: Latin/Hebrew alphabet, ISO
8859-8, 1988. Part 9: Latin alphabet No. 5, ISO 8859-9, 1990.
[ISO-646] International Standard--Information Processing--ISO 7-bit
coded character set for information interchange, ISO 646:1983.
[X400] Schicker, Pietro, "Message Handling Systems, X.400", Message
Handling Systems and Distributed Applications, E. Stefferud, O-j.
Jacobsen, and P. Schicker, eds., North-Holland, 1989, pp. 3-41.
[I18N-MAIL] Yuri Demchenko, TERENA, "i18n and Multilingual support in
Internet mail", http://www.terena.nl/multiling/ml-mua/mldoc-review.html.
Reader's Comments
I was a little surprised to see an article like this come from someone with an @arsdigita.com address, but read it hoping that this would be a fair treatment of the subject, with both pros and cons. Surprise! No such luck. Mr. Minsky seems to have focused exclusively on the technical aspects of sending HTML e-mail to people (presumably folks who have requested it), and completely ignored the other side of the question -- the social consequences of sending HTML e-mail. This was a fine examination of the technical aspects of sending HTML e-mail -- it just didn't look at the whole picture, and indeed, it suggests there isn't much else to look at.
-- Mike Sugimoto, June 9, 2000
Henry,
-- Bill Schneider, June 22, 2000
I think that Mike Sugimoto raises some excellent questions, and that
publishers would do very well to pay heed to his strong feelings on
the subject.
-- Henry Minsky, June 28, 2000
That's it.
-- Paulo Eduardo Neves, August 31, 2000
In several years of e-mail admin I have to say that MS e-mail clients are both the most error-prone and the least error-tolerant that I have encountered. This means that they are more likely to have problems with each other's messages than cause problems for others.
All the *nix mail clients I have used are robust enough to ignore these errors. Its the MS clients that can't cope with the errors caused by MS clients.
[comment quoted from discussion on Slashdot with permission of author (itsbruce@netscape.net.REMOVE)]
-- Henry Minsky, September 23, 2000
Thanks for the very thorough article -- it's exactly what I was looking for. The other readers' comments are helpful too.
-- Steve Yost, October 4, 2000
I wouldn't mind seeing a similar discussion to this one restricted to rules for formatting plain text messages. In particular, there seem to be issues around:
-- Bob Sidebotham, October 20, 2000
Related Links