email.message
: Representing an email messageSource code: Lib/email/message.py
[UNKNOWN NODE transition]New in version 3.6: 1
The central class in the email
package is the EmailMessage
class, imported from the email.message
module. It is the base class for
the email
object model. EmailMessage
provides the core
functionality for setting and querying header fields, for accessing message
bodies, and for creating or modifying structured messages.
An email message consists of headers and a payload (which is also referred
to as the content). Headers are RFC 5322 or RFC 6532 style field names
and values, where the field name and value are separated by a colon. The colon
is not part of either the field name or the field value. The payload may be a
simple text message, or a binary object, or a structured sequence of
sub-messages each with their own set of headers and their own payload. The
latter type of payload is indicated by the message having a MIME type such as
multipart/*
or message/rfc822
.
The conceptual model provided by an EmailMessage
object is that of an
ordered dictionary of headers coupled with a payload that represents the
RFC 5322 body of the message, which might be a list of sub-EmailMessage
objects. In addition to the normal dictionary methods for accessing the header
names and values, there are methods for accessing specialized information from
the headers (for example the MIME content type), for operating on the payload,
for generating a serialized version of the message, and for recursively walking
over the object tree.
The EmailMessage
dictionary-like interface is indexed by the header
names, which must be ASCII values. The values of the dictionary are strings
with some extra methods. Headers are stored and returned in case-preserving
form, but field names are matched case-insensitively. Unlike a real dict,
there is an ordering to the keys, and there can be duplicate keys. Additional
methods are provided for working with headers that have duplicate keys.
The payload is either a string or bytes object, in the case of simple message
objects, or a list of EmailMessage
objects, for MIME container
documents such as multipart/*
and message/rfc822
message objects.
class email.message.EmailMessage(policy=default)
If policy is specified use the rules it specifies to update and serialize
the representation of the message. If policy is not set, use the
default
policy, which follows the rules of the email
RFCs except for line endings (instead of the RFC mandated \r\n
, it uses
the Python standard \n
line endings). For more information see the
policy
documentation.
as_string(unixfrom=False, maxheaderlen=None, policy=None)
Return the entire message flattened as a string. When optional
unixfrom is true, the envelope header is included in the returned
string. unixfrom defaults to False
. For backward compatibility
with the base Message
class maxheaderlen is
accepted, but defaults to None
, which means that by default the line
length is controlled by the
max_line_length
of the policy. The
policy argument may be used to override the default policy obtained
from the message instance. This can be used to control some of the
formatting produced by the method, since the specified policy will be
passed to the Generator
.
Flattening the message may trigger changes to the EmailMessage
if defaults need to be filled in to complete the transformation to a
string (for example, MIME boundaries may be generated or modified).
Note that this method is provided as a convenience and may not be the
most useful way to serialize messages in your application, especially if
you are dealing with multiple messages. See
email.generator.Generator
for a more flexible API for
serializing messages. Note also that this method is restricted to
producing messages serialized as "7 bit clean" when
utf8
is False
, which is the default.
Changed in version 3.6: the default behavior when maxheaderlen is not specified was changed from defaulting to 0 to defaulting to the value of max_line_length from the policy.
__str__()
Equivalent to [UNKNOWN NODE title_reference]. Allows
str(msg)
to produce a string containing the serialized message in a
readable format.
Changed in version 3.4: the method was changed to use utf8=True
,
thus producing an RFC 6531-like message representation, instead of
being a direct alias for as_string()
.
as_bytes(unixfrom=False, policy=None)
Return the entire message flattened as a bytes object. When optional
unixfrom is true, the envelope header is included in the returned
string. unixfrom defaults to False
. The policy argument may be
used to override the default policy obtained from the message instance.
This can be used to control some of the formatting produced by the
method, since the specified policy will be passed to the
BytesGenerator
.
Flattening the message may trigger changes to the EmailMessage
if defaults need to be filled in to complete the transformation to a
string (for example, MIME boundaries may be generated or modified).
Note that this method is provided as a convenience and may not be the
most useful way to serialize messages in your application, especially if
you are dealing with multiple messages. See
email.generator.BytesGenerator
for a more flexible API for
serializing messages.
__bytes__()
Equivalent to as_bytes()
. Allows bytes(msg)
to produce a
bytes object containing the serialized message.
is_multipart()
Return True
if the message's payload is a list of sub-EmailMessage
objects, otherwise return False
. When
is_multipart()
returns False
, the payload should be a string
object (which might be a CTE encoded binary payload). Note that
is_multipart()
returning True
does not necessarily mean that
"msg.get_content_maintype() == 'multipart'" will return the True
.
For example, is_multipart
will return True
when the
EmailMessage
is of type message/rfc822
.
set_unixfrom(unixfrom)
Set the message's envelope header to unixfrom, which should be a
string. (See mboxMessage
for a brief description of
this header.)
get_unixfrom()
Return the message's envelope header. Defaults to None
if the
envelope header was never set.
The following methods implement the mapping-like interface for accessing the
message's headers. Note that there are some semantic differences
between these methods and a normal mapping (i.e. dictionary) interface. For
example, in a dictionary there are no duplicate keys, but here there may be
duplicate message headers. Also, in dictionaries there is no guaranteed
order to the keys returned by keys()
, but in an EmailMessage
object, headers are always returned in the order they appeared in the
original message, or in which they were added to the message later. Any
header deleted and then re-added is always appended to the end of the
header list.
These semantic differences are intentional and are biased toward convenience in the most common use cases.
Note that in all cases, any envelope header present in the message is not included in the mapping interface.
__len__()
Return the total number of headers, including duplicates.
__contains__(name)
Return true if the message object has a field named name. Matching is
done without regard to case and name does not include the trailing
colon. Used for the in
operator. For example:
if 'message-id' in myMessage:
print('Message-ID:', myMessage['message-id'])
__getitem__(name)
Return the value of the named header field. name does not include the
colon field separator. If the header is missing, None
is returned; a
KeyError
is never raised.
Note that if the named field appears more than once in the message's
headers, exactly which of those field values will be returned is
undefined. Use the get_all()
method to get the values of all the
extant headers named name.
Using the standard (non-compat32
) policies, the returned value is an
instance of a subclass of email.headerregistry.BaseHeader
.
__setitem__(name, val)
Add a header to the message with field name name and value val. The field is appended to the end of the message's existing headers.
Note that this does not overwrite or delete any existing header with the same name. If you want to ensure that the new header is the only one present in the message with field name name, delete the field first, e.g.:
del msg['subject']
msg['subject'] = 'Python roolz!'
If the policy
defines certain headers to be unique (as the standard
policies do), this method may raise a ValueError
when an attempt
is made to assign a value to such a header when one already exists. This
behavior is intentional for consistency's sake, but do not depend on it
as we may choose to make such assignments do an automatic deletion of the
existing header in the future.
__delitem__(name)
Delete all occurrences of the field with name name from the message's headers. No exception is raised if the named field isn't present in the headers.
keys()
Return a list of all the message's header field names.
values()
Return a list of all the message's field values.
items()
Return a list of 2-tuples containing all the message's field headers and values.
get(name, failobj=None)
Return the value of the named header field. This is identical to
__getitem__()
except that optional failobj is returned if the
named header is missing (failobj defaults to None
).
Here are some additional useful header related methods:
get_all(name, failobj=None)
Return a list of all the values for the field named name. If there are
no such named headers in the message, failobj is returned (defaults to
None
).
add_header(_name, _value, **_params)
Extended header setting. This method is similar to __setitem__()
except that additional header parameters can be provided as keyword
arguments. _name is the header field to add and _value is the
primary value for the header.
For each item in the keyword argument dictionary _params, the key is
taken as the parameter name, with underscores converted to dashes (since
dashes are illegal in Python identifiers). Normally, the parameter will
be added as key="value"
unless the value is None
, in which case
only the key will be added.
If the value contains non-ASCII characters, the charset and language may
be explicitly controlled by specifying the value as a three tuple in the
format (CHARSET, LANGUAGE, VALUE)
, where CHARSET
is a string
naming the charset to be used to encode the value, LANGUAGE
can
usually be set to None
or the empty string (see RFC 2231 for other
possibilities), and VALUE
is the string value containing non-ASCII
code points. If a three tuple is not passed and the value contains
non-ASCII characters, it is automatically encoded in RFC 2231 format
using a CHARSET
of utf-8
and a LANGUAGE
of None
.
Here is an example:
msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')
This will add a header that looks like
Content-Disposition: attachment; filename="bud.gif"
An example of the extended interface with non-ASCII characters:
msg.add_header('Content-Disposition', 'attachment',
filename=('iso-8859-1', '', 'Fußballer.ppt'))
replace_header(_name, _value)
Replace a header. Replace the first header found in the message that
matches _name, retaining header order and field name case of the
original header. If no matching header is found, raise a
KeyError
.
get_content_type()
Return the message's content type, coerced to lower case of the form
maintype/subtype
. If there is no Content-Type
header in the message return the value returned by
get_default_type()
. If the Content-Type
header is
invalid, return text/plain
.
(According to RFC 2045, messages always have a default type,
get_content_type()
will always return a value. RFC 2045 defines
a message's default type to be text/plain
unless it appears
inside a multipart/digest
container, in which case it would
be message/rfc822
. If the Content-Type
header
has an invalid type specification, RFC 2045 mandates that the default
type be text/plain
.)
get_content_maintype()
Return the message's main content type. This is the maintype
part of the string returned by get_content_type()
.
get_content_subtype()
Return the message's sub-content type. This is the subtype
part of the string returned by get_content_type()
.
get_default_type()
Return the default content type. Most messages have a default content
type of text/plain
, except for messages that are subparts of
multipart/digest
containers. Such subparts have a default
content type of message/rfc822
.
set_default_type(ctype)
Set the default content type. ctype should either be
text/plain
or message/rfc822
, although this is
not enforced. The default content type is not stored in the
Content-Type
header, so it only affects the return value of
the get_content_type
methods when no Content-Type
header is present in the message.
set_param(param, value, header='Content-Type', requote=True, charset=None, language='', replace=False)
Set a parameter in the Content-Type
header. If the
parameter already exists in the header, replace its value with value.
When header is Content-Type
(the default) and the header does not
yet exist in the message, add it, set its value to
text/plain
, and append the new parameter value. Optional
header specifies an alternative header to Content-Type
.
If the value contains non-ASCII characters, the charset and language may
be explicitly specified using the optional charset and language
parameters. Optional language specifies the RFC 2231 language,
defaulting to the empty string. Both charset and language should be
strings. The default is to use the utf8
charset and None
for
the language.
If replace is False
(the default) the header is moved to the
end of the list of headers. If replace is True
, the header
will be updated in place.
Use of the requote parameter with EmailMessage
objects is
deprecated.
Note that existing parameter values of headers may be accessed through
the params
attribute of the
header value (for example, msg['Content-Type'].params['charset']
.
Changed in version 3.4: replace
keyword was added.
del_param(param, header='content-type', requote=True)
Remove the given parameter completely from the Content-Type
header. The header will be re-written in place without the parameter or
its value. Optional header specifies an alternative to
Content-Type
.
Use of the requote parameter with EmailMessage
objects is
deprecated.
get_filename(failobj=None)
Return the value of the filename
parameter of the
Content-Disposition
header of the message. If the header
does not have a filename
parameter, this method falls back to looking
for the name
parameter on the Content-Type
header. If
neither is found, or the header is missing, then failobj is returned.
The returned string will always be unquoted as per
email.utils.unquote()
.
get_boundary(failobj=None)
Return the value of the boundary
parameter of the
Content-Type
header of the message, or failobj if either
the header is missing, or has no boundary
parameter. The returned
string will always be unquoted as per email.utils.unquote()
.
set_boundary(boundary)
Set the boundary
parameter of the Content-Type
header to
boundary. set_boundary()
will always quote boundary if
necessary. A HeaderParseError
is raised if the
message object has no Content-Type
header.
Note that using this method is subtly different from deleting the old
Content-Type
header and adding a new one with the new
boundary via add_header()
, because set_boundary()
preserves
the order of the Content-Type
header in the list of
headers.
get_content_charset(failobj=None)
Return the charset
parameter of the Content-Type
header,
coerced to lower case. If there is no Content-Type
header, or if
that header has no charset
parameter, failobj is returned.
get_charsets(failobj=None)
Return a list containing the character set names in the message. If the
message is a multipart
, then the list will contain one element
for each subpart in the payload, otherwise, it will be a list of length 1.
Each item in the list will be a string which is the value of the
charset
parameter in the Content-Type
header for the
represented subpart. If the subpart has no Content-Type
header, no charset
parameter, or is not of the text
main
MIME type, then that item in the returned list will be failobj.
is_attachment()
Return True
if there is a Content-Disposition
header
and its (case insensitive) value is attachment
, False
otherwise.
Changed in version 3.4.2: is_attachment is now a method instead of a property, for consistency
with is_multipart()
.
get_content_disposition()
Return the lowercased value (without parameters) of the message's
Content-Disposition
header if it has one, or None
. The
possible values for this method are inline, attachment or None
if the message follows RFC 2183.
New in version 3.5.
The following methods relate to interrogating and manipulating the content (payload) of the message.
walk()
The walk()
method is an all-purpose generator which can be used to
iterate over all the parts and subparts of a message object tree, in
depth-first traversal order. You will typically use walk()
as the
iterator in a for
loop; each iteration returns the next subpart.
Here's an example that prints the MIME type of every part of a multipart message structure:
>>> for part in msg.walk():
... print(part.get_content_type())
multipart/report
text/plain
message/delivery-status
text/plain
text/plain
message/rfc822
text/plain
walk
iterates over the subparts of any part where
is_multipart()
returns True
, even though
msg.get_content_maintype() == 'multipart'
may return False
. We
can see this in our example by making use of the _structure
debug
helper function:
>>> for part in msg.walk():
... print(part.get_content_maintype() == 'multipart',
... part.is_multipart())
True True
False False
False True
False False
False False
False True
False False
>>> _structure(msg)
multipart/report
text/plain
message/delivery-status
text/plain
text/plain
message/rfc822
text/plain
Here the message
parts are not multiparts
, but they do contain
subparts. is_multipart()
returns True
and walk
descends
into the subparts.
get_body(preferencelist=('related', 'html', 'plain'))
Return the MIME part that is the best candidate to be the "body" of the message.
preferencelist must be a sequence of strings from the set related
,
html
, and plain
, and indicates the order of preference for the
content type of the part returned.
Start looking for candidate matches with the object on which the
get_body
method is called.
If related
is not included in preferencelist, consider the root
part (or subpart of the root part) of any related encountered as a
candidate if the (sub-)part matches a preference.
When encountering a multipart/related
, check the start
parameter
and if a part with a matching Content-ID
is found, consider
only it when looking for candidate matches. Otherwise consider only the
first (default root) part of the multipart/related
.
If a part has a Content-Disposition
header, only consider
the part a candidate match if the value of the header is inline
.
If none of the candidates matches any of the preferences in
preferencelist, return None
.
Notes: (1) For most applications the only preferencelist combinations
that really make sense are ('plain',)
, ('html', 'plain')
, and the
default ('related', 'html', 'plain')
. (2) Because matching starts
with the object on which get_body
is called, calling get_body
on
a multipart/related
will return the object itself unless
preferencelist has a non-default value. (3) Messages (or message parts)
that do not specify a Content-Type
or whose
Content-Type
header is invalid will be treated as if they
are of type text/plain
, which may occasionally cause get_body
to
return unexpected results.
iter_attachments()
Return an iterator over all of the immediate sub-parts of the message
that are not candidate "body" parts. That is, skip the first occurrence
of each of text/plain
, text/html
, multipart/related
, or
multipart/alternative
(unless they are explicitly marked as
attachments via Content-Disposition: attachment
), and
return all remaining parts. When applied directly to a
multipart/related
, return an iterator over the all the related parts
except the root part (ie: the part pointed to by the start
parameter,
or the first part if there is no start
parameter or the start
parameter doesn't match the Content-ID
of any of the
parts). When applied directly to a multipart/alternative
or a
non-multipart
, return an empty iterator.
iter_parts()
Return an iterator over all of the immediate sub-parts of the message,
which will be empty for a non-multipart
. (See also
walk()
.)
get_content(*args, content_manager=None, **kw)
Call the get_content()
method
of the content_manager, passing self as the message object, and passing
along any other arguments or keywords as additional arguments. If
content_manager is not specified, use the content_manager
specified
by the current policy
.
set_content(*args, content_manager=None, **kw)
Call the set_content()
method
of the content_manager, passing self as the message object, and passing
along any other arguments or keywords as additional arguments. If
content_manager is not specified, use the content_manager
specified
by the current policy
.
make_related(boundary=None)
Convert a non-multipart
message into a multipart/related
message,
moving any existing Content-
headers and payload into a
(new) first part of the multipart
. If boundary is specified, use
it as the boundary string in the multipart, otherwise leave the boundary
to be automatically created when it is needed (for example, when the
message is serialized).
make_alternative(boundary=None)
Convert a non-multipart
or a multipart/related
into a
multipart/alternative
, moving any existing Content-
headers and payload into a (new) first part of the multipart
. If
boundary is specified, use it as the boundary string in the multipart,
otherwise leave the boundary to be automatically created when it is
needed (for example, when the message is serialized).
make_mixed(boundary=None)
Convert a non-multipart
, a multipart/related
, or a
multipart-alternative
into a multipart/mixed
, moving any existing
Content-
headers and payload into a (new) first part of the
multipart
. If boundary is specified, use it as the boundary string
in the multipart, otherwise leave the boundary to be automatically
created when it is needed (for example, when the message is serialized).
add_related(*args, content_manager=None, **kw)
If the message is a multipart/related
, create a new message
object, pass all of the arguments to its set_content()
method,
and attach()
it to the multipart
. If
the message is a non-multipart
, call make_related()
and then
proceed as above. If the message is any other type of multipart
,
raise a TypeError
. If content_manager is not specified, use
the content_manager
specified by the current policy
.
If the added part has no Content-Disposition
header,
add one with the value inline
.
add_alternative(*args, content_manager=None, **kw)
If the message is a multipart/alternative
, create a new message
object, pass all of the arguments to its set_content()
method, and
attach()
it to the multipart
. If the
message is a non-multipart
or multipart/related
, call
make_alternative()
and then proceed as above. If the message is
any other type of multipart
, raise a TypeError
. If
content_manager is not specified, use the content_manager
specified
by the current policy
.
add_attachment(*args, content_manager=None, **kw)
If the message is a multipart/mixed
, create a new message object,
pass all of the arguments to its set_content()
method, and
attach()
it to the multipart
. If the
message is a non-multipart
, multipart/related
, or
multipart/alternative
, call make_mixed()
and then proceed as
above. If content_manager is not specified, use the content_manager
specified by the current policy
. If the added part
has no Content-Disposition
header, add one with the value
attachment
. This method can be used both for explicit attachments
(Content-Disposition: attachment
and inline
attachments
(Content-Disposition: inline
), by passing appropriate
options to the content_manager
.
clear()
Remove the payload and all of the headers.
clear_content()
Remove the payload and all of the Content-
headers, leaving
all other headers intact and in their original order.
EmailMessage
objects have the following instance attributes:
preamble
The format of a MIME document allows for some text between the blank line following the headers, and the first multipart boundary string. Normally, this text is never visible in a MIME-aware mail reader because it falls outside the standard MIME armor. However, when viewing the raw text of the message, or when viewing the message in a non-MIME aware reader, this text can become visible.
The preamble attribute contains this leading extra-armor text for MIME
documents. When the Parser
discovers some text
after the headers but before the first boundary string, it assigns this
text to the message's preamble attribute. When the
Generator
is writing out the plain text
representation of a MIME message, and it finds the
message has a preamble attribute, it will write this text in the area
between the headers and the first boundary. See email.parser
and
email.generator
for details.
Note that if the message object has no preamble, the preamble attribute
will be None
.
epilogue
The epilogue attribute acts the same way as the preamble attribute,
except that it contains text that appears between the last boundary and
the end of the message. As with the preamble
,
if there is no epilog text this attribute will be None
.
defects
The defects attribute contains a list of all the problems found when
parsing this message. See email.errors
for a detailed description
of the possible parsing defects.
class email.message.MIMEPart(policy=default)
This class represents a subpart of a MIME message. It is identical to
EmailMessage
, except that no MIME-Version
headers are
added when set_content()
is called, since sub-parts do
not need their own MIME-Version
headers.