java.lang.Object
  |
  +--javax.mail.internet.MimeUtility
This is a utility class that provides various MIME related functionality.
There is a set of methods to encode and decode MIME headers as per RFC 2047. A brief description of how such headers are handled is given below:
RFC 822 mail headers must contain only US-ASCII characters. Headers that contain non US-ASCII characters must be encoded so that they contain only US-ASCII characters. This process uses either the BASE64 or the quoted-printable (QP) encoding to encode certain characters. RFC 2047 describes this in detail.
In Java, Strings contain (16-bit) Unicode characters. ASCII is a subset of Unicode (and occupies the range 0 - 127). A String that contains only ASCII characters is already mail-safe. If the String contains non US-ASCII characters, it must be encoded. An additional complexity in this step is that, since Unicode is not yet a widely used charset, one might want to first charset-encode the String into another charset and then do the transfer-encoding.
Note that to get the actual bytes of a mail-safe String (say, for sending over SMTP), one must do:
    byte[] bytes = string.getBytes("iso-8859-1");
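As a rough illustration of the paragraphs above, the following sketch checks whether a String is already mail-safe and, if so, converts it to bytes. The class MailSafeSketch and its methods isAllAscii and toWireBytes are hypothetical names and are not part of JavaMail. StandardCharsets.ISO_8859_1 is used here as the modern equivalent of the "iso-8859-1" charset name.

```java
import java.nio.charset.StandardCharsets;

public class MailSafeSketch {

    /** Returns true if every char lies in the US-ASCII range 0 - 127. */
    public static boolean isAllAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    /** Converts an already mail-safe String to bytes, one byte per char. */
    public static byte[] toWireBytes(String mailSafe) {
        if (!isAllAscii(mailSafe)) {
            // Such a value has to be encoded (RFC 2047) before it is sent.
            throw new IllegalArgumentException("not mail-safe; encode it first");
        }
        return mailSafe.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(isAllAscii("FooBar Mailer"));      // true
        System.out.println(isAllAscii("Gr\u00fc\u00dfe"));    // false
        System.out.println(toWireBytes("Subject").length);    // 7
    }
}
```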
The setHeader() and addHeader() methods on MimeMessage and MimeBodyPart assume that the given header values are Unicode strings that contain only US-ASCII characters. Hence the callers of those methods must ensure that the values they pass do not contain non US-ASCII characters. The methods in this class help do this.
The getHeader() family of methods on MimeMessage and MimeBodyPart return the raw header value. These might be encoded as per RFC 2047, and if so, must be decoded into Unicode Strings. The methods in this class help to do this.
Field Summary
static int | ALL
(package private) static int | ALL_ASCII
(package private) static int | MOSTLY_ASCII
(package private) static int | MOSTLY_NONASCII
Method Summary
(package private) static int | checkAscii(byte[] b)
    Check if the given byte array contains non US-ASCII characters.
(package private) static int | checkAscii(InputStream is, int max, boolean breakOnNonAscii)
    Check if the given input stream contains non US-ASCII characters.
(package private) static int | checkAscii(String s)
    Check if the given string contains non US-ASCII characters.
static InputStream | decode(InputStream is, String encoding)
    Decode the given input stream.
static String | decodeText(String etext)
    Decode "unstructured" headers, that is, headers that are defined as '*text' as per RFC 822.
static String | decodeWord(String eword)
    The string is parsed using the rules in RFC 2047 for parsing an "encoded-word".
static OutputStream | encode(OutputStream os, String encoding)
    Wrap an encoder around the given output stream.
static String | encodeText(String text)
    Encode a RFC 822 "text" token into mail-safe form as per RFC 2047.
static String | encodeText(String text, String charset, String encoding)
    Encode a RFC 822 "text" token into mail-safe form as per RFC 2047.
static String | encodeWord(String word)
    Encode a RFC 822 "word" token into mail-safe form as per RFC 2047.
static String | encodeWord(String word, String charset, String encoding)
    Encode a RFC 822 "word" token into mail-safe form as per RFC 2047.
static String | getDefaultJavaCharset()
    Get the default charset corresponding to the system's current default locale.
(package private) static String | getDefaultMIMECharset()
(package private) static String | getEncoding(DataHandler dh)
    Same as getEncoding(DataSource) except that instead of reading the data from an InputStream it uses the writeTo method to examine the data.
static String | getEncoding(DataSource ds)
    Get the content-transfer-encoding that should be applied to the input stream of this datasource, to make it mail-safe.
static String | javaCharset(String charset)
    Convert a MIME charset name into a valid Java charset name.
static String | mimeCharset(String charset)
    Convert a java charset into its MIME charset name.
static String | quote(String word, String specials)
    A utility method to quote a word, if the word contains any characters from the specified 'specials' list.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
public static final int ALL
static final int ALL_ASCII
static final int MOSTLY_ASCII
static final int MOSTLY_NONASCII
Method Detail
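public static String getEncoding(DataSource ds)
Get the content-transfer-encoding that should be applied to the input stream of this datasource, to make it mail-safe.
The algorithm used here is:
- If the primary type of this datasource is "text" and all the bytes in its input stream are US-ASCII, then the encoding is "7bit". If more than half of the bytes are non-US-ASCII, then the encoding is "base64". If less than half of the bytes are non-US-ASCII, then the encoding is "quoted-printable".
- If the primary type of this datasource is not "text", then if all the bytes of its input stream are US-ASCII, the encoding is "7bit". If there is even one non-US-ASCII character, the encoding is "base64".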
ds - DataSource

static String getEncoding(DataHandler dh)
Same as getEncoding(DataSource) except that instead of reading the data from an InputStream it uses the writeTo method to examine the data. This is more efficient in the common case of a DataHandler created with an object and a MIME type (for example, a "text/plain" String) because all the I/O is done in this thread. In the case requiring an InputStream the DataHandler uses a thread, a pair of pipe streams, and the writeTo method to produce the data.
XXX - This should be made public in JavaMail 1.2.

public static InputStream decode(InputStream is, String encoding) throws MessagingException
Decode the given input stream.
is - input stream
encoding - the encoding of the stream.

public static OutputStream encode(OutputStream os, String encoding) throws MessagingException
Wrap an encoder around the given output stream.
os - output stream
encoding - the encoding of the stream.

public static String encodeText(String text) throws UnsupportedEncodingException
Encode a RFC 822 "text" token into mail-safe form as per RFC 2047.
The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the platform's default charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.
Note that this method should be used to encode only "unstructured" RFC 822 headers.
Example of usage:
    MimePart part = ...
    String rawvalue = "FooBar Mailer, Japanese version 1.1";
    try {
        // If we know for sure that rawvalue contains only US-ASCII
        // characters, we can skip the encoding part
        part.setHeader("X-mailer", MimeUtility.encodeText(rawvalue));
    } catch (UnsupportedEncodingException e) {
        // encoding failure
    } catch (MessagingException me) {
        // setHeader() failure
    }
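To make the output format concrete, here is a minimal sketch of the kind of RFC 2047 encoded-word that a non US-ASCII value could be turned into. It is not JavaMail's implementation. The class EncodedWordSketch and the method encodeB are hypothetical, the charset is fixed to UTF-8 rather than the platform default, only the "B" encoding is shown, and the 75-character limit that RFC 2047 places on each encoded-word is ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class EncodedWordSketch {

    /**
     * Returns the text unchanged if it is pure US-ASCII; otherwise wraps it
     * in a single encoded-word of the form =?charset?B?base64-text?=.
     * Assumption for this sketch: UTF-8 is always used as the charset.
     */
    public static String encodeB(String text) {
        boolean allAscii = text.chars().allMatch(c -> c < 128);
        if (allAscii) {
            return text; // already mail-safe
        }
        byte[] charsetEncoded = text.getBytes(StandardCharsets.UTF_8);
        String transferEncoded = Base64.getEncoder().encodeToString(charsetEncoded);
        return "=?UTF-8?B?" + transferEncoded + "?=";
    }

    public static void main(String[] args) {
        System.out.println(encodeB("FooBar Mailer")); // FooBar Mailer
        System.out.println(encodeB("\u00e9"));        // =?UTF-8?B?w6k=?=
    }
}
```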
text - unicode string
UnsupportedEncodingException - if the encoding fails

public static String encodeText(String text, String charset, String encoding) throws UnsupportedEncodingException
Encode a RFC 822 "text" token into mail-safe form as per RFC 2047.
The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the specified charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.
Note that this method should be used to encode only "unstructured" RFC 822 headers.
text - the header value
charset - the charset. If this parameter is null, the platform's default charset is used.
encoding - the encoding to be used. Currently supported values are "B" and "Q". If this parameter is null, then the "Q" encoding is used if most of the characters to be encoded are in the ASCII charset, otherwise the "B" encoding is used.

public static String decodeText(String etext) throws UnsupportedEncodingException
Decode "unstructured" headers, that is, headers that are defined as '*text' as per RFC 822.
The string is decoded using the algorithm specified in RFC 2047, Section 6.1.1. If the charset-conversion fails for any sequence, an UnsupportedEncodingException is thrown. If the String is not an RFC 2047 style encoded header, it is returned as-is.
Example of usage:
    MimePart part = ...
    String rawvalue = null;
    String value = null;
    try {
        // getHeader() returns null when the header is absent
        String[] headers = part.getHeader("X-mailer");
        if (headers != null && headers.length > 0) {
            rawvalue = headers[0];
            value = MimeUtility.decodeText(rawvalue);
        }
    } catch (UnsupportedEncodingException e) {
        // Don't care; fall back to the raw value
        value = rawvalue;
    } catch (MessagingException me) { }
    return value;
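For readers who want to see what the decoding step involves, the sketch below decodes a single "B"-encoded encoded-word by hand. It is only an illustration under simplifying assumptions: the whole value is treated as one encoded-word, the "Q" encoding and mixed plain/encoded text are not handled, and the class DecodeWordSketch and method decodeB are hypothetical names, not JavaMail API.

```java
import java.nio.charset.Charset;
import java.util.Base64;

public class DecodeWordSketch {

    /**
     * Decodes a value of the form =?charset?B?base64-text?=.
     * Anything that does not match that shape is returned as-is.
     * Malformed base64 or an unknown charset name raises an
     * IllegalArgumentException (or a subclass of it).
     */
    public static String decodeB(String value) {
        if (value.length() < 4 || !value.startsWith("=?") || !value.endsWith("?=")) {
            return value;
        }
        // Strip the leading "=?" and trailing "?=", then split on '?'.
        String[] parts = value.substring(2, value.length() - 2).split("\\?");
        if (parts.length != 3 || !parts[1].equalsIgnoreCase("B")) {
            return value;
        }
        byte[] transferDecoded = Base64.getDecoder().decode(parts[2]);
        return new String(transferDecoded, Charset.forName(parts[0]));
    }

    public static void main(String[] args) {
        System.out.println(decodeB("FooBar Mailer"));                  // FooBar Mailer
        System.out.println(decodeB("=?UTF-8?B?w6k=?=").equals("\u00e9")); // true
    }
}
```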
etext - the possibly encoded value
UnsupportedEncodingException - if the charset conversion failed.

public static String encodeWord(String word) throws UnsupportedEncodingException
Encode a RFC 822 "word" token into mail-safe form as per RFC 2047.
The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the platform's default charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.
This method is meant to be used when creating RFC 822 "phrases". The InternetAddress class, for example, uses this to encode its 'phrase' component.
word - unicode string
UnsupportedEncodingException - if the encoding fails

public static String encodeWord(String word, String charset, String encoding) throws UnsupportedEncodingException
Encode a RFC 822 "word" token into mail-safe form as per RFC 2047.
The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the specified charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.
word - unicode string
charset - the MIME charset
encoding - the encoding to be used. Currently supported values are "B" and "Q". If this parameter is null, then the "Q" encoding is used if most of the characters to be encoded are in the ASCII charset, otherwise the "B" encoding is used.
UnsupportedEncodingException - if the encoding fails

public static String decodeWord(String eword) throws ParseException, UnsupportedEncodingException
The string is parsed using the rules in RFC 2047 for parsing an "encoded-word".
eword - the possibly encoded value
ParseException - if the string is not an encoded-word as per RFC 2047.
UnsupportedEncodingException - if the charset conversion failed.

public static String quote(String word, String specials)
A utility method to quote a word, if the word contains any characters from the specified 'specials' list.
The HeaderTokenizer class defines two special sets of delimiters: MIME and RFC 822.
This method is typically used during the generation of RFC 822 and MIME header fields.
word - word to be quoted
specials - the set of special characters
See also: HeaderTokenizer.MIME, HeaderTokenizer.RFC822
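The quoting behaviour can be pictured with the following simplified sketch. It assumes that a word is wrapped in double quotes when it is empty, contains a character from the specials list, or contains a double quote or backslash, and that the latter two are backslash-escaped. The real method may treat further characters (such as control characters) specially. The class QuoteSketch is hypothetical, and the specials string in the example is an arbitrary stand-in rather than the value of HeaderTokenizer.MIME or HeaderTokenizer.RFC822.

```java
public class QuoteSketch {

    /**
     * Returns the word unchanged if it needs no quoting; otherwise returns
     * it wrapped in double quotes with '"' and '\' escaped by a backslash.
     */
    public static String quote(String word, String specials) {
        boolean needsQuoting = word.isEmpty();
        StringBuilder quoted = new StringBuilder("\"");
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c == '"' || c == '\\') {
                quoted.append('\\'); // escape inside the quoted string
                needsQuoting = true;
            } else if (specials.indexOf(c) >= 0) {
                needsQuoting = true;
            }
            quoted.append(c);
        }
        return needsQuoting ? quoted.append('"').toString() : word;
    }

    public static void main(String[] args) {
        String specials = " ,;"; // hypothetical specials list for the demo
        System.out.println(quote("plain", specials));     // plain
        System.out.println(quote("two words", specials)); // "two words"
    }
}
```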
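
public static String javaCharset(String charset)
Convert a MIME charset name into a valid Java charset name.
charset - the MIME charset name

public static String mimeCharset(String charset)
Convert a java charset into its MIME charset name.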
Note that a future version of the JDK (post 1.2) might provide this functionality, in which case this method may be deprecated.
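For comparison, JDKs from 1.4 onward include java.nio.charset.Charset, which resolves charset aliases to a canonical name. The sketch below uses it to illustrate the kind of name mapping that javaCharset() and mimeCharset() perform. It does not reproduce JavaMail's own mapping table, and the class CharsetNameSketch and method canonicalName are hypothetical.

```java
import java.nio.charset.Charset;

public class CharsetNameSketch {

    /**
     * Resolves a charset alias to the JDK's canonical name.
     * Unknown or syntactically illegal names are returned unchanged.
     */
    public static String canonicalName(String name) {
        try {
            return Charset.forName(name).name();
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return name;
        }
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("8859_1"));    // ISO-8859-1
        System.out.println(canonicalName("ISO8859_1")); // ISO-8859-1
    }
}
```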
charset - the JDK charset

public static String getDefaultJavaCharset()
Get the default charset corresponding to the system's current default locale.

static String getDefaultMIMECharset()
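
static int checkAscii(String s)
Check if the given string contains non US-ASCII characters.
s - string

static int checkAscii(byte[] b)
Check if the given byte array contains non US-ASCII characters.
b - byte array

static int checkAscii(InputStream is, int max, boolean breakOnNonAscii)
Check if the given input stream contains non US-ASCII characters.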
Up to max bytes are checked. If max is set to ALL, then all the bytes available in this input stream are checked. If breakOnNonAscii is true, the check terminates when the first non-US-ASCII character is found and MOSTLY_NONASCII is returned. Else, the check continues till max bytes or till the end of stream.
is - the input stream
max - maximum bytes to check for. The special value ALL indicates that all the bytes in this input stream must be checked.
breakOnNonAscii - if true, then terminate the check when the first non-US-ASCII character is found.
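The behaviour described for the stream variant can be sketched as follows. This is an illustration only, not the JavaMail source: the numeric values of the constants, the "more ASCII than non-ASCII bytes" threshold for MOSTLY_ASCII, and the byte[] convenience overload are assumptions made for this example, and the class CheckAsciiSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class CheckAsciiSketch {
    // Constant values are assumed for illustration.
    public static final int ALL = -1;
    public static final int ALL_ASCII = 1;
    public static final int MOSTLY_ASCII = 2;
    public static final int MOSTLY_NONASCII = 3;

    /** Classifies up to max bytes of the stream (all bytes if max == ALL). */
    public static int checkAscii(InputStream is, int max, boolean breakOnNonAscii)
            throws IOException {
        int ascii = 0;
        int nonAscii = 0;
        int b;
        while ((max == ALL || ascii + nonAscii < max) && (b = is.read()) != -1) {
            if (b > 127) {
                if (breakOnNonAscii) {
                    return MOSTLY_NONASCII; // stop at the first non-US-ASCII byte
                }
                nonAscii++;
            } else {
                ascii++;
            }
        }
        if (nonAscii == 0) {
            return ALL_ASCII;
        }
        // Assumed threshold: a simple majority decides.
        return ascii > nonAscii ? MOSTLY_ASCII : MOSTLY_NONASCII;
    }

    /** Convenience overload for in-memory data (an addition for this sketch). */
    public static int checkAscii(byte[] data, int max, boolean breakOnNonAscii) {
        try {
            return checkAscii(new ByteArrayInputStream(data), max, breakOnNonAscii);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory data
        }
    }

    public static void main(String[] args) {
        byte[] data = {65, 66, 67, (byte) 0xE9};
        System.out.println(checkAscii(data, ALL, false)); // 2 (MOSTLY_ASCII)
        System.out.println(checkAscii(data, ALL, true));  // 3 (MOSTLY_NONASCII)
        System.out.println(checkAscii(data, 3, false));   // 1 (ALL_ASCII)
    }
}
```

A classification like this is what allows a caller to pick a compact encoding for mostly-ASCII data and a denser one for mostly non-ASCII data.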