Kirill Smelkov / cpython
Commit e53d977e
Authored Mar 15, 2012 by Senthil Kumaran

Explain the use of charset parameter with Content-Type header: issue11082

Parents: df2aecbf, 6b3434ae

Showing 3 changed files with 58 additions and 28 deletions (+58, -28)
Doc/library/urllib.parse.rst (+4, -3)
Doc/library/urllib.request.rst (+51, -23)
Lib/urllib/request.py (+3, -2)
Doc/library/urllib.parse.rst

@@ -512,9 +512,10 @@ task isn't already covered by the URL parsing functions above.
  Convert a mapping object or a sequence of two-element tuples, which may
  either be a :class:`str` or a :class:`bytes`, to a "percent-encoded"
- string. The resultant string must be converted to bytes using the
- user-specified encoding before it is sent to :func:`urlopen` as the optional
- *data* argument.
+ string. If the resultant string is to be used as a *data* for POST
+ operation with :func:`urlopen` function, then it should be properly encoded
+ to bytes, otherwise it would result in a :exc:`TypeError`.
  The resulting string is a series of ``key=value`` pairs separated by ``'&'``
  characters, where both *key* and *value* are quoted using :func:`quote_plus`
  above. When a sequence of two-element tuples is used as the *query*
...
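
As an illustration of the reworded paragraph (not part of the commit), the following minimal sketch shows that `urlencode` returns a `str`, which then has to be encoded to bytes before it can serve as POST *data*. The form fields are made up for the example.

```python
from urllib.parse import urlencode

# Hypothetical form fields, chosen only for illustration.
fields = {'spam': 1, 'eggs': 2, 'bacon': 0}

query = urlencode(fields)      # a str of key=value pairs joined by '&'
data = query.encode('utf-8')   # bytes, usable as the *data* argument

print(type(query).__name__, query)
print(type(data).__name__, data)
```

The encoding step is kept separate here so that the codec passed to `encode` stays visible next to the call that needs it.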
Doc/library/urllib.request.rst

@@ -2,9 +2,10 @@
  =============================================================
  .. module:: urllib.request
- :synopsis: Next generation URL opening library.
+ :synopsis: Extensible library for opening URLs.
  .. moduleauthor:: Jeremy Hylton <jeremy@alum.mit.edu>
  .. sectionauthor:: Moshe Zadka <moshez@users.sourceforge.net>
+ .. sectionauthor:: Senthil Kumaran <senthil@uthcode.com>
  The :mod:`urllib.request` module defines functions and classes which help in
@@ -20,16 +21,26 @@ The :mod:`urllib.request` module defines the following functions:
  Open the URL *url*, which can be either a string or a
  :class:`Request` object.
- *data* may be a bytes object specifying additional data to send to the
+ *data* must be a bytes object specifying additional data to be sent to the
  server, or ``None`` if no such data is needed. *data* may also be an
  iterable object and in that case Content-Length value must be specified in
  the headers. Currently HTTP requests are the only ones that use *data*; the
  HTTP request will be a POST instead of a GET when the *data* parameter is
- provided. *data* should be a buffer in the standard
+ provided.
+ *data* should be a buffer in the standard
  :mimetype:`application/x-www-form-urlencoded` format. The
  :func:`urllib.parse.urlencode` function takes a mapping or sequence of
- 2-tuples and returns a string in this format. urllib.request module uses
- HTTP/1.1 and includes ``Connection:close`` header in its HTTP requests.
+ 2-tuples and returns a string in this format. It should be encoded to bytes
+ before being used as the *data* parameter. The charset parameter in
+ ``Content-Type`` header may be used to specify the encoding. If charset
+ parameter is not sent with the Content-Type header, the server following the
+ HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
+ encoding. It is advisable to use charset parameter with encoding used in
+ ``Content-Type`` header with the :class:`Request`.
+ urllib.request module uses HTTP/1.1 and includes ``Connection:close`` header
+ in its HTTP requests.
  The optional *timeout* parameter specifies a timeout in seconds for
  blocking operations like the connection attempt (if not specified,
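
The charset advice in this hunk can be sketched roughly as follows. This is an illustration rather than code from the commit: the URL is a placeholder, the field name and value are invented, and ISO-8859-1 is picked only to show the codec and the header staying in step. The request is built but never sent.

```python
import urllib.parse
import urllib.request

charset = 'iso-8859-1'  # assumption: a codec the receiving server understands

# urlencode() percent-encodes non-ASCII characters using *encoding*,
# so the resulting str contains only ASCII characters.
query = urllib.parse.urlencode({'name': 'caf\u00e9'}, encoding=charset)
data = query.encode('ascii')

# Placeholder URL; the request is only constructed here.
request = urllib.request.Request('http://www.example.com/', data=data)
request.add_header('Content-Type',
                   'application/x-www-form-urlencoded;charset=' + charset)

print(query)
print(request.get_method())
```

Deriving both the percent-encoding and the header value from one `charset` variable is a design choice of this sketch; it keeps the declared and the actual encoding from drifting apart.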
@@ -66,9 +77,10 @@ The :mod:`urllib.request` module defines the following functions:
  are handled through the proxy when they are set.
  The legacy ``urllib.urlopen`` function from Python 2.6 and earlier has been
- discontinued; :func:`urlopen` corresponds to the old ``urllib2.urlopen``.
- Proxy handling, which was done by passing a dictionary parameter to
- ``urllib.urlopen``, can be obtained by using :class:`ProxyHandler` objects.
+ discontinued; :func:`urllib.request.urlopen` corresponds to the old
+ ``urllib2.urlopen``. Proxy handling, which was done by passing a dictionary
+ parameter to ``urllib.urlopen``, can be obtained by using
+ :class:`ProxyHandler` objects.
  .. versionchanged:: 3.2
  *cafile* and *capath* were added.
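
For readers unfamiliar with the replacement for the old dictionary parameter, here is a minimal sketch of proxy handling through a `ProxyHandler` object. The proxy address is hypothetical, and no request is actually made.

```python
import urllib.request

# Hypothetical proxy address, used only for illustration.
proxies = {'http': 'http://proxy.example.com:3128/'}

proxy_handler = urllib.request.ProxyHandler(proxies)
opener = urllib.request.build_opener(proxy_handler)

# opener.open('http://www.example.com/') would now be routed via the proxy.
```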
@@ -83,10 +95,11 @@ The :mod:`urllib.request` module defines the following functions:
  .. function:: install_opener(opener)
  Install an :class:`OpenerDirector` instance as the default global opener.
- Installing an opener is only necessary if you want urlopen to use that opener;
- otherwise, simply call :meth:`OpenerDirector.open` instead of :func:`urlopen`.
- The code does not check for a real :class:`OpenerDirector`, and any class with
- the appropriate interface will work.
+ Installing an opener is only necessary if you want urlopen to use that
+ opener; otherwise, simply call :meth:`OpenerDirector.open` instead of
+ :func:`~urllib.request.urlopen`. The code does not check for a real
+ :class:`OpenerDirector`, and any class with the appropriate interface will
+ work.
  .. function:: build_opener([handler, ...])
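
The two options described for `install_opener` can be sketched as below. The user agent string is an invented value, and the network calls are left commented out so the snippet runs offline.

```python
import urllib.request

opener = urllib.request.build_opener()
# Hypothetical user agent string, for illustration only.
opener.addheaders = [('User-agent', 'example-client/0.1')]

# Option 1: use the opener directly and leave urlopen() untouched.
# response = opener.open('http://www.example.com/')

# Option 2: make it the global default, so urlopen() uses it as well.
urllib.request.install_opener(opener)
# response = urllib.request.urlopen('http://www.example.com/')
```

Option 1 keeps the customization local to the calling code, which is usually the safer choice in a library; option 2 changes behavior for every caller of `urlopen` in the process.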
@@ -138,13 +151,21 @@ The following classes are provided:
  *url* should be a string containing a valid URL.
- *data* may be a bytes object specifying additional data to send to the
+ *data* must be a bytes object specifying additional data to send to the
  server, or ``None`` if no such data is needed. Currently HTTP requests are
  the only ones that use *data*; the HTTP request will be a POST instead of a
  GET when the *data* parameter is provided. *data* should be a buffer in the
- standard :mimetype:`application/x-www-form-urlencoded` format. The
- :func:`urllib.parse.urlencode` function takes a mapping or sequence of
- 2-tuples and returns a string in this format.
+ standard :mimetype:`application/x-www-form-urlencoded` format.
+ The :func:`urllib.parse.urlencode` function takes a mapping or sequence of
+ 2-tuples and returns a string in this format. It should be encoded to bytes
+ before being used as the *data* parameter. The charset parameter in
+ ``Content-Type`` header may be used to specify the encoding. If charset
+ parameter is not sent with the Content-Type header, the server following the
+ HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
+ encoding. It is advisable to use charset parameter with encoding used in
+ ``Content-Type`` header with the :class:`Request`.
  *headers* should be a dictionary, and will be treated as if
  :meth:`add_header` was called with each key and value as arguments.
@@ -156,8 +177,11 @@ The following classes are provided:
  :mod:`urllib`'s default user agent string is
  ``"Python-urllib/2.6"`` (on Python 2.6).
- The following two arguments, *origin_req_host* and *unverifiable*,
- are only of interest for correct handling of third-party HTTP cookies:
+ An example of using ``Content-Type`` header with *data* argument would be
+ sending a dictionary like ``{"Content-Type":" application/x-www-form-urlencoded;charset=utf-8"}``
+ The final two arguments are only of interest for correct handling
+ of third-party HTTP cookies:
  *origin_req_host* should be the request-host of the origin
  transaction, as defined by :rfc:`2965`. It defaults to
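
The dictionary form mentioned in this hunk might be used as in the sketch below, which passes the header through the *headers* argument of `Request` instead of calling `add_header`. The URL and form fields are placeholders, and the request is only constructed, not sent.

```python
import urllib.parse
import urllib.request

data = urllib.parse.urlencode({'spam': 1, 'eggs': 2}).encode('utf-8')
headers = {'Content-Type': 'application/x-www-form-urlencoded;charset=utf-8'}

# Placeholder URL; nothing is transmitted by constructing the object.
request = urllib.request.Request('http://www.example.com/', data, headers)

# Header names are stored capitalized, so look them up as 'Content-type'.
print(request.get_header('Content-type'))
```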
@@ -1107,8 +1131,9 @@ every :class:`Request`. To change this::
  opener.open('http://www.example.com/')
  Also, remember that a few standard headers (:mailheader:`Content-Length`,
- :mailheader:`Content-Type` and :mailheader:`Host`) are added when the
- :class:`Request` is passed to :func:`urlopen` (or :meth:`OpenerDirector.open`).
+ :mailheader:`Content-Type` without charset parameter and :mailheader:`Host`)
+ are added when the :class:`Request` is passed to :func:`urlopen` (or
+ :meth:`OpenerDirector.open`).
  .. _urllib-examples:
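
To see the automatically added headers without touching the network, one possible sketch is to run only the HTTP handler's request pre-processing step. Calling `http_request` directly is a shortcut taken for this illustration and leans on how current CPython wires its handlers, so treat it as an implementation detail; the URL and payload are placeholders.

```python
import urllib.request

# An opener holding only the HTTP handler; nothing is sent in this sketch.
opener = urllib.request.OpenerDirector()
handler = urllib.request.HTTPHandler()
opener.add_handler(handler)

request = urllib.request.Request('http://www.example.com/', data=b'spam=1')

# http_request() is the pre-processing hook that open() runs before
# connecting; it fills in the standard headers that are still missing.
processed = handler.http_request(request)

print(processed.get_header('Content-type'))
print(processed.get_header('Content-length'))
print(processed.get_header('Host'))
```

The default `Content-type` carries no charset parameter, which is why the commit recommends setting the header explicitly when the encoding matters.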
@@ -1126,9 +1151,12 @@ from urlencode is encoded to bytes before it is sent to urlopen as data::
  >>> import urllib.request
  >>> import urllib.parse
- >>> params = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
- >>> params = params.encode('utf-8')
- >>> f = urllib.request.urlopen("http://www.musi-cal.com/cgi-bin/query", params)
+ >>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
+ >>> data = data.encode('utf-8')
+ >>> request = urllib.request.Request("http://requestb.in/xrbl82xr")
+ >>> # adding charset parameter to the Content-Type header.
+ >>> request.add_header("Content-Type","application/x-www-form-urlencoded;charset=utf-8")
+ >>> f = urllib.request.urlopen(request, data)
  >>> print(f.read().decode('utf-8'))
  The following example uses an explicitly specified HTTP proxy, overriding
...
Lib/urllib/request.py

@@ -1172,8 +1172,9 @@ class AbstractHTTPHandler(BaseHandler):
         if request.data is not None:  # POST
             data = request.data
             if isinstance(data, str):
-                raise TypeError("POST data should be bytes"
-                        " or an iterable of bytes. It cannot be str.")
+                msg = "POST data should be bytes or an iterable of bytes." \
+                        "It cannot be str"
+                raise TypeError(msg)
             if not request.has_header('Content-type'):
                 request.add_unredirected_header(
                     'Content-type',
...
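
The check touched by this hunk can be exercised with a sketch like the one below, which passes `str` data and catches the resulting `TypeError`. It assumes that the HTTP handler's pre-processing runs before any connection attempt, as it does in current CPython, so nothing should reach the network; the URL is a placeholder and the exact error message may differ between Python versions.

```python
import urllib.request

# An opener with only the HTTP handler, to keep the sketch small.
opener = urllib.request.OpenerDirector()
opener.add_handler(urllib.request.HTTPHandler())

# str instead of bytes: this is the mistake the check guards against.
bad_request = urllib.request.Request('http://www.example.com/', data='spam=1')

try:
    opener.open(bad_request)
except TypeError as exc:
    error = exc
    print('rejected:', exc)
else:
    error = None
```

Encoding the payload first, for example with `'spam=1'.encode('utf-8')`, is what avoids the exception.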