Werkzeug provides a couple of functions to parse and generate HTTP headers. They are useful when implementing WSGI middleware or whenever you are operating on a lower-level layer. All of this functionality is also exposed by the request and response objects.
Date Functions
The following functions simplify working with times in an HTTP context.
Werkzeug uses offset-naive datetime objects internally that store the time in UTC. If you’re working with timezones in your application, make sure to replace the tzinfo attribute with UTC timezone information before processing the values.
werkzeug.http.cookie_date(expires=None)
Formats the time to ensure compatibility with Netscape’s cookie standard.
Accepts a floating point number expressed in seconds since the epoch, a datetime object, or a timetuple. All times are in UTC. The parse_date() function can be used to parse such a date.
Outputs a string in the format Wdy, DD-Mon-YYYY HH:MM:SS GMT.
werkzeug.http.http_date(timestamp=None)
Formats the time to match the RFC 1123 date format.
Accepts a floating point number expressed in seconds since the epoch, a datetime object, or a timetuple. All times are in UTC. The parse_date() function can be used to parse such a date.
Outputs a string in the format Wdy, DD Mon YYYY HH:MM:SS GMT.
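As a minimal sketch of how http_date() pairs with parse_date() (documented below), the snippet formats a Unix timestamp and parses the result back. The timestamp is arbitrary. Newer Werkzeug releases return timezone-aware datetime objects from parse_date(), so the example drops tzinfo before comparing; that normalisation step is an assumption about the installed version, not part of the API described here.

```python
from datetime import datetime

from werkzeug.http import http_date, parse_date

# Format the Unix epoch (0 seconds) as an RFC 1123 date string.
header_value = http_date(0)

# Parse it back. Depending on the Werkzeug version the result is naive or
# UTC-aware, so strip tzinfo before comparing (version-dependent assumption).
parsed = parse_date(header_value)
parsed_naive = parsed.replace(tzinfo=None)

# Unparseable input yields None rather than raising an exception.
failed = parse_date("not a date")
```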
werkzeug.http.parse_date(value)
Parse one of the following date formats into a datetime object:
Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format
If parsing fails the return value is None. On success a datetime.datetime object is returned.
Header Parsing
The following functions can be used to parse incoming HTTP headers. Because Python does not provide data structures with the semantics required by RFC 2616, Werkzeug implements some custom data structures that are documented separately.
werkzeug.http.parse_options_header(value, multiple=False)
Parse a Content-Type-like header into a tuple with the content type and the options.
This should not be used to parse Cache-Control-like headers, which use a slightly different format. For these headers use the parse_dict_header() function.
New in version 0.5.
- value – the header to parse.
- multiple – whether to try to parse and return multiple MIME types.
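A short sketch of the basic call, assuming a typical Content-Type value; the header string is only an illustration.

```python
from werkzeug.http import parse_options_header

# Split a Content-Type value into the MIME type and its parameters.
mimetype, options = parse_options_header("text/html; charset=utf8")

# The options behave like a dict keyed by parameter name.
charset = options.get("charset")
```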
werkzeug.http.parse_set_header(value, on_update=None)
Parse a set-like header and return a HeaderSet object.
The return value is an object that treats the items case-insensitively and keeps the order of the items.
To create a header from the HeaderSet again, use the dump_header() function.
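The case-insensitive, order-preserving behaviour can be sketched as follows; the header value is an arbitrary example.

```python
from werkzeug.http import parse_set_header

hs = parse_set_header('token, "quoted value"')

# Membership tests ignore case.
has_token = "TOKEN" in hs

# Items keep their original order, and surrounding quotes are removed.
position = hs.index("quoted value")
items = list(hs)
```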
werkzeug.http.parse_list_header(value)
Parse lists as described by RFC 2068 Section 2.
In particular, parse comma-separated lists where the elements of the list may include quoted-strings. A quoted-string could contain a comma. A non-quoted string could have quotes in the middle. Quotes are removed automatically after parsing.
It works like parse_set_header(), except that items may appear multiple times and case sensitivity is preserved.
The return value is a standard list.
To create a header from the list again, use the dump_header() function.
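A minimal sketch of the list parsing, using a made-up header value that contains a quoted comma and a repeated item:

```python
from werkzeug.http import parse_list_header

# A quoted element may contain a comma, and duplicates are preserved.
items = parse_list_header('token, "quoted, value", token')
```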
werkzeug.http.parse_dict_header(value, cls=dict)
Parse lists of key, value pairs as described by RFC 2068 Section 2 and convert them into a Python dict (or any other mapping object created from the type with a dict-like interface provided by the cls argument).
If there is no value for a key it will be None.
To create a header from the dict again, use the dump_header() function.
Changed in version 0.9: Added support for the cls argument.
- value – a string with a dict header.
- cls – callable to use for storage of parsed results.
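Both cases described above, a header with quoted values and a bare key, can be sketched like this. The header strings are illustrative only.

```python
from werkzeug.http import parse_dict_header

# Quoted values are unquoted while parsing.
params = parse_dict_header('foo="is a fish", bar="as well"')

# A bare key with no "=value" part maps to None.
bare = parse_dict_header("key_without_value")
```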
werkzeug.http.parse_accept_header(value[, cls])
Parses an HTTP Accept-* header. This does not implement a complete valid algorithm but one that supports at least value and quality extraction.
Returns a new Accept object (basically a list of (value, quality) tuples sorted by the quality with some additional accessor methods).
The second parameter can be a subclass of Accept that is created with the parsed values and returned.
- value – the accept header string to be parsed.
- cls – the wrapper class for the return value (can be Accept or a subclass thereof).
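As a rough illustration of value and quality extraction, the sketch below parses a small Accept header with the default Accept class. The header value is made up for the example.

```python
from werkzeug.http import parse_accept_header

accept = parse_accept_header("text/html, application/xml;q=0.9")

# The entry with the highest quality is exposed as ``best``.
best = accept.best

# Indexing with a string returns the quality of that value.
xml_quality = accept["application/xml"]
```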
werkzeug.http.parse_cache_control_header(value, on_update=None, cls=None)
Parse a cache control header. The RFC distinguishes between response and request cache control; this function does not. It’s your responsibility not to use the wrong control statements.
New in version 0.5: The cls parameter was added. If not specified, an immutable RequestCacheControl is returned.
- value – a cache control header to be parsed.
- on_update – an optional callable that is called every time a value on the CacheControl object is changed.
- cls – the class for the returned object. By default RequestCacheControl is used.
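A small sketch of reading directives from the returned object, assuming the default RequestCacheControl class and an invented header value:

```python
from werkzeug.http import parse_cache_control_header

cc = parse_cache_control_header("max-age=3600, no-store")

# max-age is exposed as an integer attribute.
max_age = cc.max_age

# Directives without a value, such as no-store, are exposed as flags.
no_store = cc.no_store
```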
werkzeug.http.parse_authorization_header(value)
Parse an HTTP basic/digest authorization header transmitted by the web browser. The return value is None if the header was invalid or not given; otherwise it is an Authorization object.
werkzeug.http.parse_www_authenticate_header(value, on_update=None)
Parse an HTTP WWW-Authenticate header into a WWWAuthenticate object.
- value – a WWW-Authenticate header to parse.
- on_update – an optional callable that is called every time a value on the WWWAuthenticate object is changed.
Returns a WWWAuthenticate object.
werkzeug.http.parse_if_range_header(value)
Parses an If-Range header, which can be an etag or a date. Returns an IfRange object.
New in version 0.7.
werkzeug.http.parse_range_header(value, make_inclusive=True)
Parses a range header into a Range object. If the header is missing or malformed, None is returned. ranges is a list of (start, stop) tuples in which stop is non-inclusive.
New in version 0.7.
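The non-inclusive stop value is easiest to see with a concrete header. The following is a sketch with an arbitrary byte range:

```python
from werkzeug.http import parse_range_header

# "bytes=0-499" asks for the first 500 bytes; the end in the header is inclusive.
rng = parse_range_header("bytes=0-499")
units = rng.units

# In the parsed object the stop value is exclusive, so 499 becomes 500.
ranges = rng.ranges

# A value without a unit and "=" cannot be parsed.
missing = parse_range_header("0-499")
```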
werkzeug.http.parse_content_range_header(value, on_update=None)
Parses a content range header into a ContentRange object or None if parsing is not possible.
New in version 0.7.
- value – a content range header to be parsed.
- on_update – an optional callable that is called every time a value on the ContentRange object is changed.
Header Utilities
The following utilities operate on HTTP headers but do not parse them. They are useful if you’re dealing with conditional responses or if you want to proxy arbitrary requests but need to remove WSGI-unsupported hop-by-hop headers. There is also a function to create HTTP header strings from the parsed data.
werkzeug.http.is_entity_header(header)
Check if a header is an entity header.
New in version 0.5.
werkzeug.http.is_hop_by_hop_header(header)
Check if a header is an HTTP/1.1 “Hop-by-Hop” header.
New in version 0.5.
werkzeug.http.remove_entity_headers(headers, allowed=('expires', 'content-location'))
Remove all entity headers from a list or Headers object. This operation works in-place. Expires and Content-Location headers are by default not removed. The reason for this is RFC 2616 section 10.3.5, which specifies some entity headers that should be sent.
Changed in version 0.5: added the allowed parameter.
- headers – a list or Headers object.
- allowed – a list of headers that should still be allowed even though they are entity headers.
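The in-place behaviour and the default allowed tuple can be sketched with a plain list of (name, value) pairs. The header values, including the X-Request-Id header, are hypothetical.

```python
from werkzeug.http import is_entity_header, remove_entity_headers

headers = [
    ("Content-Type", "text/plain"),
    ("Expires", "Thu, 01 Jan 1970 00:00:00 GMT"),
    ("X-Request-Id", "abc123"),  # hypothetical application header
]

content_type_is_entity = is_entity_header("Content-Type")

# The list is modified in place. Content-Type is dropped, Expires survives
# because it is in the default ``allowed`` tuple, and headers that are not
# entity headers are left alone.
remove_entity_headers(headers)
remaining = [name for name, _ in headers]
```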
werkzeug.http.remove_hop_by_hop_headers(headers)
Remove all HTTP/1.1 “Hop-by-Hop” headers from a list or Headers object. This operation works in-place.
New in version 0.5.
- headers – a list or Headers object.
werkzeug.http.is_byte_range_valid(start, stop, length)
Checks if a given byte content range is valid for the given length.
New in version 0.7.
werkzeug.http.quote_header_value(value, extra_chars='', allow_token=True)
Quote a header value if necessary.
New in version 0.5.
- value – the value to quote.
- extra_chars – a list of extra characters to skip quoting.
- allow_token – if this is enabled token values are returned unchanged.
werkzeug.http.unquote_header_value(value, is_filename=False)
Unquotes a header value (the reversal of quote_header_value()). This does not use the real unquoting but what browsers are actually using for quoting.
New in version 0.5.
werkzeug.http.dump_header(iterable, allow_token=True)
Dump an HTTP header again. This is the reversal of parse_list_header(), parse_set_header() and parse_dict_header(). This also quotes strings that include an equals sign unless you pass it as a dict of key, value pairs.
- iterable – the iterable or dict of values to quote.
- allow_token – if set to False, tokens as values are disallowed. See quote_header_value() for more details.
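A brief sketch of dumping a list and a dict and parsing them back; the values are arbitrary and chosen so that one of them needs quoting.

```python
from werkzeug.http import dump_header, parse_dict_header, parse_list_header

# Values containing a space are quoted; plain tokens are left as they are.
list_header = dump_header(["foo", "bar baz"])
dict_header = dump_header({"foo": "bar baz"})

# Parsing the dumped strings gives back the original values.
round_trip_list = parse_list_header(list_header)
round_trip_dict = parse_dict_header(dict_header)
```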
Conditional Response Helpers
For conditional responses the following functions might be useful:
werkzeug.http.parse_etags(value)
Parse an etag header.
Returns an ETags object.
werkzeug.http.quote_etag(etag, weak=False)
Quote an etag.
- etag – the etag to quote.
- weak – set to True to tag it “weak”.
werkzeug.http.unquote_etag(etag)
Unquote a single etag.
Returns an (etag, weak) tuple.
werkzeug.http.generate_etag(data)
Generate an etag for some data.
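The etag helpers above can be combined as in the following sketch. The tag names and payload are invented, and since the hash used by generate_etag() may differ between Werkzeug versions, the example only relies on it being deterministic.

```python
from werkzeug.http import generate_etag, parse_etags, quote_etag, unquote_etag

# Quoting wraps the tag in double quotes.
quoted = quote_etag("foo")

# Unquoting returns an (etag, weak) tuple.
strong = unquote_etag('"bar"')
weak = unquote_etag('W/"bar"')

# Parse a header value holding one strong and one weak tag.
etags = parse_etags('"foo", W/"bar"')
has_strong = etags.contains("foo")
has_weak = etags.contains_weak("bar")

# The same payload always produces the same tag.
tag = generate_etag(b"payload")
```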
werkzeug.http.is_resource_modified(environ, etag=None, data=None, last_modified=None, ignore_if_range=True)
Convenience method for conditional requests.
- environ – the WSGI environment of the request to be checked.
- etag – the etag for the response for comparison.
- data – or alternatively the data of the response to automatically generate an etag using generate_etag().
- last_modified – an optional date of the last modification.
- ignore_if_range – if False, the If-Range header will be taken into account.
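A minimal sketch of an ETag comparison, assuming a hand-built WSGI environment with only the keys needed here (a real server supplies many more). The function returns True when the resource should be treated as modified.

```python
from werkzeug.http import is_resource_modified, quote_etag

# Hypothetical environ: the client already holds the version tagged "v1".
environ = {
    "REQUEST_METHOD": "GET",
    "HTTP_IF_NONE_MATCH": quote_etag("v1"),
}

# Same ETag: the cached copy is still valid, so a 304 response would fit.
unchanged = is_resource_modified(environ, etag="v1")

# Different ETag: the resource changed and a full response is needed.
changed = is_resource_modified(environ, etag="v2")
```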
Constants
werkzeug.http.HTTP_STATUS_CODES
A dict of status code -> default status message pairs. This is used by the wrappers and other places where an integer status code is expanded to a string throughout Werkzeug.
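As an illustration of expanding an integer code to a status string, the helper below is a hypothetical function written for this example, not part of Werkzeug. Its fallback text for unknown codes is likewise an assumption.

```python
from werkzeug.http import HTTP_STATUS_CODES


def status_line(code):
    """Build a WSGI status string such as "404 Not Found"."""
    # Fall back to a generic phrase for codes missing from the table
    # (the fallback text is an assumption, not Werkzeug behaviour).
    return "%d %s" % (code, HTTP_STATUS_CODES.get(code, "Unknown"))


not_found = status_line(404)
```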
Form Data Parsing
Werkzeug provides the form parsing functions separately from the request object so that you can access form data from a plain WSGI environment.
The following formats are currently supported by the form data parser:
- application/x-www-form-urlencoded
- multipart/form-data
Nested multipart is not currently supported (Werkzeug 0.9), but it isn’t used by any of the modern web browsers.
Usage example:
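The sketch below parses a small multipart body from a hand-built WSGI environment using parse_form_data(), documented further down. The boundary, field name and payload are arbitrary, and the environ holds only the keys this example needs.

```python
from io import BytesIO

from werkzeug.formparser import parse_form_data

# A single form field named "test" in a multipart body with boundary "foo".
data = (
    b'--foo\r\nContent-Disposition: form-data; name="test"\r\n'
    b"\r\nHello World!\r\n--foo--"
)
environ = {
    "wsgi.input": BytesIO(data),
    "CONTENT_LENGTH": str(len(data)),
    "CONTENT_TYPE": "multipart/form-data; boundary=foo",
    "REQUEST_METHOD": "POST",
}

# The field ends up in ``form``; no files were uploaded, so ``files`` is empty.
stream, form, files = parse_form_data(environ)
value = form["test"]
```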
Normally the WSGI environment is provided by the WSGI gateway with the incoming data as part of it. If you want to generate such fake WSGI environments for unit testing, you might want to use the create_environ() function or the EnvironBuilder instead.
class werkzeug.formparser.FormDataParser(stream_factory=None, charset='utf-8', errors='replace', max_form_memory_size=None, max_content_length=None, cls=None, silent=True)
This class implements parsing of form data for Werkzeug. By itself it can parse multipart and url encoded form data. It can be subclassed and extended but for most mimetypes it is a better idea to use the untouched stream and expose it as separate attributes on a request object.
New in version 0.8.
- stream_factory – an optional callable that returns a new readable and writable file descriptor. This callable works the same as _get_file_stream().
- charset – the character set for URL and URL-encoded form data.
- errors – the encoding error behavior.
- max_form_memory_size – the maximum number of bytes to be accepted for in-memory stored form data. If the data exceeds the value specified, a RequestEntityTooLarge exception is raised.
- max_content_length – if this is provided and the transmitted data is longer than this value, a RequestEntityTooLarge exception is raised.
- cls – an optional dict class to use. If this is not specified or None, the default MultiDict is used.
- silent – if set to False, parsing errors will not be caught.
werkzeug.formparser.parse_form_data(environ, stream_factory=None, charset='utf-8', errors='replace', max_form_memory_size=None, max_content_length=None, cls=None, silent=True)
Parse the form data in the environ and return it as a tuple in the form (stream, form, files). You should only call this method if the transport method is POST, PUT, or PATCH.
If the mimetype of the data transmitted is multipart/form-data, the files multidict will be filled with FileStorage objects. If the mimetype is unknown, the input stream is wrapped and returned as the first argument; otherwise the stream is empty.
This is a shortcut for the common usage of FormDataParser.
Have a look at Dealing with Request Data for more details.
New in version 0.5: The max_form_memory_size, max_content_length and cls parameters were added.
New in version 0.5.1: The optional silent flag was added.
- environ – the WSGI environment to be used for parsing.
- stream_factory – an optional callable that returns a new readable and writable file descriptor. This callable works the same as _get_file_stream().
- charset – the character set for URL and URL-encoded form data.
- errors – the encoding error behavior.
- max_form_memory_size – the maximum number of bytes to be accepted for in-memory stored form data. If the data exceeds the value specified, a RequestEntityTooLarge exception is raised.
- max_content_length – if this is provided and the transmitted data is longer than this value, a RequestEntityTooLarge exception is raised.
- cls – an optional dict class to use. If this is not specified or None, the default MultiDict is used.
- silent – if set to False, parsing errors will not be caught.
Returns a tuple in the form (stream, form, files).
werkzeug.formparser.parse_multipart_headers(iterable)
Parses multipart headers from an iterable that yields lines (including the trailing newline symbol). The iterable has to be newline terminated.
The iterable will stop at the line where the headers ended so it can be further consumed.