Commit bc6f74a5 authored by matthewbelisle-wf, committed by Victor Stinner

bpo-34866: Add max_num_fields to cgi.FieldStorage (GH-9660) (GH-9969)

Add `max_num_fields` to `cgi.FieldStorage` to make DoS attacks harder by
limiting the number of `MiniFieldStorage` objects created by `FieldStorage`.

(cherry picked from commit 20914483)
parent 64ffee7a
@@ -292,12 +292,12 @@ algorithms implemented in this module in other circumstances.
    passed to :func:`urlparse.parse_qs` unchanged.
-.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])
+.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]])
    This function is deprecated in this module. Use :func:`urlparse.parse_qs`
    instead. It is maintained here only for backward compatibility.
-.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])
+.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]])
    This function is deprecated in this module. Use :func:`urlparse.parse_qsl`
    instead. It is maintained here only for backward compatibility.
......
@@ -126,7 +126,7 @@ The :mod:`urlparse` module defines the following functions:
       Added IPv6 URL parsing capabilities.
-.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])
+.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]])
    Parse a query string given as a string argument (data of type
    :mimetype:`application/x-www-form-urlencoded`). Data are returned as a
@@ -143,14 +143,20 @@ The :mod:`urlparse` module defines the following functions:
    parsing errors. If false (the default), errors are silently ignored. If true,
    errors raise a :exc:`ValueError` exception.
+   The optional argument *max_num_fields* is the maximum number of fields to
+   read. If set, then throws a :exc:`ValueError` if there are more than
+   *max_num_fields* fields read.
    Use the :func:`urllib.urlencode` function to convert such dictionaries into
    query strings.
    .. versionadded:: 2.6
       Copied from the :mod:`cgi` module.
+   .. versionchanged:: 2.7.16
+      Added *max_num_fields* parameter.
-.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])
+.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]])
    Parse a query string given as a string argument (data of type
    :mimetype:`application/x-www-form-urlencoded`). Data are returned as a list of
@@ -166,12 +172,18 @@ The :mod:`urlparse` module defines the following functions:
    parsing errors. If false (the default), errors are silently ignored. If true,
    errors raise a :exc:`ValueError` exception.
+   The optional argument *max_num_fields* is the maximum number of fields to
+   read. If set, then throws a :exc:`ValueError` if there are more than
+   *max_num_fields* fields read.
    Use the :func:`urllib.urlencode` function to convert such lists of pairs into
    query strings.
    .. versionadded:: 2.6
       Copied from the :mod:`cgi` module.
+   .. versionchanged:: 2.7.16
+      Added *max_num_fields* parameter.
 .. function:: urlunparse(parts)
......
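As a rough illustration of the documented behaviour, the sketch below calls `parse_qsl` with *max_num_fields* and treats the `ValueError` as a rejected request. The fallback import from `urllib.parse` is an assumption for readers on Python 3, where a parameter of the same name exists; the query strings and the `try_parse` helper are made up for this example and are not part of the commit.

```python
try:
    from urlparse import parse_qsl          # Python 2.7.16 and later
except ImportError:
    from urllib.parse import parse_qsl      # assumption: Python 3 equivalent


def try_parse(qs, limit):
    """Hypothetical helper: return parsed pairs, or None if the limit is exceeded."""
    try:
        return parse_qsl(qs, max_num_fields=limit)
    except ValueError:
        # Raised when the query string holds more than `limit` fields.
        return None


within_limit = try_parse('a=1&b=2&c=3', 3)   # three fields, limit of three
over_limit = try_parse('a=1&b=2&c=3', 2)     # three fields, limit of two
```

Leaving *max_num_fields* at its default of `None` keeps the previous unlimited behaviour, so existing callers are unaffected unless they opt in.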
@@ -184,11 +184,12 @@ def parse_qs(qs, keep_blank_values=0, strict_parsing=0):
     return urlparse.parse_qs(qs, keep_blank_values, strict_parsing)
-def parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
+def parse_qsl(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None):
     """Parse a query given as a string argument."""
     warn("cgi.parse_qsl is deprecated, use urlparse.parse_qsl instead",
          PendingDeprecationWarning, 2)
-    return urlparse.parse_qsl(qs, keep_blank_values, strict_parsing)
+    return urlparse.parse_qsl(qs, keep_blank_values, strict_parsing,
+                              max_num_fields)
 def parse_multipart(fp, pdict):
     """Parse multipart input.
@@ -393,7 +394,8 @@ class FieldStorage:
     """
     def __init__(self, fp=None, headers=None, outerboundary="",
-                 environ=os.environ, keep_blank_values=0, strict_parsing=0):
+                 environ=os.environ, keep_blank_values=0, strict_parsing=0,
+                 max_num_fields=None):
         """Constructor. Read multipart/* until last part.
         Arguments, all optional:
@@ -420,10 +422,14 @@ class FieldStorage:
             If false (the default), errors are silently ignored.
             If true, errors raise a ValueError exception.
+        max_num_fields: int. If set, then __init__ throws a ValueError
+            if there are more than n fields read by parse_qsl().
         """
         method = 'GET'
         self.keep_blank_values = keep_blank_values
         self.strict_parsing = strict_parsing
+        self.max_num_fields = max_num_fields
         if 'REQUEST_METHOD' in environ:
             method = environ['REQUEST_METHOD'].upper()
         self.qs_on_post = None
@@ -606,10 +612,9 @@ class FieldStorage:
         qs = self.fp.read(self.length)
         if self.qs_on_post:
             qs += '&' + self.qs_on_post
-        self.list = list = []
-        for key, value in urlparse.parse_qsl(qs, self.keep_blank_values,
-                                            self.strict_parsing):
-            list.append(MiniFieldStorage(key, value))
+        query = urlparse.parse_qsl(qs, self.keep_blank_values,
+                                   self.strict_parsing, self.max_num_fields)
+        self.list = [MiniFieldStorage(key, value) for key, value in query]
         self.skip_lines()
     FieldStorageClass = None
@@ -621,19 +626,38 @@ class FieldStorage:
             raise ValueError, 'Invalid boundary in multipart form: %r' % (ib,)
         self.list = []
         if self.qs_on_post:
-            for key, value in urlparse.parse_qsl(self.qs_on_post,
-                                self.keep_blank_values, self.strict_parsing):
-                self.list.append(MiniFieldStorage(key, value))
+            query = urlparse.parse_qsl(self.qs_on_post,
+                                       self.keep_blank_values,
+                                       self.strict_parsing,
+                                       self.max_num_fields)
+            self.list.extend(MiniFieldStorage(key, value)
+                             for key, value in query)
             FieldStorageClass = None
+        # Propagate max_num_fields into the sub class appropriately
+        max_num_fields = self.max_num_fields
+        if max_num_fields is not None:
+            max_num_fields -= len(self.list)
         klass = self.FieldStorageClass or self.__class__
         part = klass(self.fp, {}, ib,
-                     environ, keep_blank_values, strict_parsing)
+                     environ, keep_blank_values, strict_parsing,
+                     max_num_fields)
         # Throw first part away
         while not part.done:
             headers = rfc822.Message(self.fp)
             part = klass(self.fp, headers, ib,
-                         environ, keep_blank_values, strict_parsing)
+                         environ, keep_blank_values, strict_parsing,
+                         max_num_fields)
+            if max_num_fields is not None:
+                max_num_fields -= 1
+                if part.list:
+                    max_num_fields -= len(part.list)
+                if max_num_fields < 0:
+                    raise ValueError('Max number of fields exceeded')
             self.list.append(part)
         self.skip_lines()
......
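The budget bookkeeping in the multipart hunk above can be modelled without any I/O. The following is a simplified sketch, not the code from the commit: `remaining_budget` and its arguments are hypothetical names, and each part is reduced to the number of nested fields it contains.

```python
def remaining_budget(max_num_fields, query_fields, parts):
    """Simplified model of the field budget kept while reading multipart data.

    max_num_fields: the limit, or None for no limit.
    query_fields:   number of fields already parsed from the query string.
    parts:          one integer per multipart part, giving the number of
                    nested fields found inside that part (0 for a plain part).
    """
    if max_num_fields is None:
        return None                      # no limit requested, nothing to track
    if query_fields > max_num_fields:
        # In the real code the parse_qsl() call itself raises at this point.
        raise ValueError('Max number of fields exceeded')
    budget = max_num_fields - query_fields
    for nested in parts:
        budget -= 1                      # the part itself counts as one field
        budget -= nested                 # plus every field nested inside it
        if budget < 0:
            raise ValueError('Max number of fields exceeded')
    return budget


# Hypothetical request: 2 query-string fields, one plain part, and one part
# that carries a single nested field, for 5 fields in total.
left_over = remaining_budget(5, 2, [0, 1])
```

With a limit of 4 the same hypothetical request runs out of budget on the second part and raises `ValueError`.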
+from io import BytesIO
 from test.test_support import run_unittest, check_warnings
 import cgi
 import os
@@ -316,6 +317,60 @@ Content-Type: text/plain
         v = gen_result(data, environ)
         self.assertEqual(self._qs_result, v)
+    def test_max_num_fields(self):
+        # For application/x-www-form-urlencoded
+        data = '&'.join(['a=a']*11)
+        environ = {
+            'CONTENT_LENGTH': str(len(data)),
+            'CONTENT_TYPE': 'application/x-www-form-urlencoded',
+            'REQUEST_METHOD': 'POST',
+        }
+        with self.assertRaises(ValueError):
+            cgi.FieldStorage(
+                fp=BytesIO(data.encode()),
+                environ=environ,
+                max_num_fields=10,
+            )
+        # For multipart/form-data
+        data = """---123
+Content-Disposition: form-data; name="a"
+
+3
+---123
+Content-Type: application/x-www-form-urlencoded
+
+a=4
+---123
+Content-Type: application/x-www-form-urlencoded
+
+a=5
+---123--
+"""
+        environ = {
+            'CONTENT_LENGTH': str(len(data)),
+            'CONTENT_TYPE': 'multipart/form-data; boundary=-123',
+            'QUERY_STRING': 'a=1&a=2',
+            'REQUEST_METHOD': 'POST',
+        }
+        # 2 GET entities
+        # 1 top level POST entities
+        # 1 entity within the second POST entity
+        # 1 entity within the third POST entity
+        with self.assertRaises(ValueError):
+            cgi.FieldStorage(
+                fp=BytesIO(data.encode()),
+                environ=environ,
+                max_num_fields=4,
+            )
+        cgi.FieldStorage(
+            fp=BytesIO(data.encode()),
+            environ=environ,
+            max_num_fields=5,
+        )
     def testQSAndFormData(self):
         data = """
 ---123
......
@@ -361,7 +361,7 @@ def unquote(s):
             append(item)
     return ''.join(res)
-def parse_qs(qs, keep_blank_values=0, strict_parsing=0):
+def parse_qs(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None):
     """Parse a query given as a string argument.
         Arguments:
@@ -378,16 +378,20 @@ def parse_qs(qs, keep_blank_values=0, strict_parsing=0):
         strict_parsing: flag indicating what to do with parsing errors.
             If false (the default), errors are silently ignored.
             If true, errors raise a ValueError exception.
+        max_num_fields: int. If set, then throws a ValueError if there
+            are more than n fields read by parse_qsl().
     """
     dict = {}
-    for name, value in parse_qsl(qs, keep_blank_values, strict_parsing):
+    for name, value in parse_qsl(qs, keep_blank_values, strict_parsing,
+                                 max_num_fields):
         if name in dict:
             dict[name].append(value)
         else:
             dict[name] = [value]
     return dict
-def parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
+def parse_qsl(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None):
     """Parse a query given as a string argument.
         Arguments:
@@ -404,8 +408,19 @@ def parse_qsl(qs, keep_blank_values=0, strict_parsing=0):
             false (the default), errors are silently ignored. If true,
             errors raise a ValueError exception.
+        max_num_fields: int. If set, then throws a ValueError if there
+            are more than n fields read by parse_qsl().
         Returns a list, as G-d intended.
     """
+    # If max_num_fields is defined then check that the number of fields
+    # does not exceed max_num_fields. This prevents a memory exhaustion DoS
+    # attack via post bodies with many fields.
+    if max_num_fields is not None:
+        num_fields = 1 + qs.count('&') + qs.count(';')
+        if max_num_fields < num_fields:
+            raise ValueError('Max number of fields exceeded')
     pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
     r = []
     for name_value in pairs:
......
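The check added to `parse_qsl` counts separators before the string is split, so an oversized input is rejected before the list of pairs is allocated. A minimal standalone sketch of that idea follows; `count_fields` and `split_pairs` are hypothetical names used only for this illustration.

```python
def count_fields(qs):
    """Number of '&'- or ';'-separated segments, computed without splitting."""
    return 1 + qs.count('&') + qs.count(';')


def split_pairs(qs, max_num_fields=None):
    """Split a query string into raw 'name=value' segments, enforcing a limit."""
    if max_num_fields is not None and count_fields(qs) > max_num_fields:
        # Reject before split() builds a potentially huge list.
        raise ValueError('Max number of fields exceeded')
    return [s2 for s1 in qs.split('&') for s2 in s1.split(';')]


segments = split_pairs('a=1&b=2;c=3', max_num_fields=3)
```

Because the count is taken from the separators alone, empty segments are included: under this sketch `'a=1&&b=2'` counts as three fields.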
Add ``max_num_fields`` to ``cgi.FieldStorage`` to make DoS attacks harder by
limiting the number of ``MiniFieldStorage`` objects created by ``FieldStorage``.