Commit bbc85f30 authored by Benjamin Peterson

port simplejson upgrade from the trunk #4136

json also now works only with unicode strings

Patch by Antoine Pitrou; updated by me
parent 9d5e41f9
@@ -112,7 +112,7 @@ Using json.tool from the shell to validate and pretty-print::

 Basic Usage
 -----------

-.. function:: dump(obj, fp[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, encoding[, default[, **kw]]]]]]]]]])
+.. function:: dump(obj, fp[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, default[, **kw]]]]]]]]]])

    Serialize *obj* as a JSON formatted stream to *fp* (a ``.write()``-supporting
    file-like object).

@@ -122,11 +122,10 @@ Basic Usage
    :class:`float`, :class:`bool`, ``None``) will be skipped instead of raising a
    :exc:`TypeError`.

-   If *ensure_ascii* is ``False`` (default: ``True``), then some chunks written
-   to *fp* may be :class:`unicode` instances, subject to normal Python
-   :class:`str` to :class:`unicode` coercion rules.  Unless ``fp.write()``
-   explicitly understands :class:`unicode` (as in :func:`codecs.getwriter`) this
-   is likely to cause an error.
+   The :mod:`json` module always produces :class:`str` objects, not
+   :class:`bytes` objects.  Therefore, ``fp.write()`` must support :class:`str`
+   input.

    If *check_circular* is ``False`` (default: ``True``), then the circular
    reference check for container types will be skipped and a circular reference

@@ -146,8 +145,6 @@ Basic Usage
    will be used instead of the default ``(', ', ': ')`` separators.  ``(',',
    ':')`` is the most compact JSON representation.

-   *encoding* is the character encoding for str instances, default is UTF-8.
-
    *default(obj)* is a function that should return a serializable version of
    *obj* or raise :exc:`TypeError`.  The default simply raises :exc:`TypeError`.

@@ -156,26 +153,17 @@ Basic Usage
    *cls* kwarg.

-.. function:: dumps(obj[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, encoding[, default[, **kw]]]]]]]]]])
+.. function:: dumps(obj[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, default[, **kw]]]]]]]]]])

-   Serialize *obj* to a JSON formatted :class:`str`.
-
-   If *ensure_ascii* is ``False``, then the return value will be a
-   :class:`unicode` instance.  The other arguments have the same meaning as in
-   :func:`dump`.
+   Serialize *obj* to a JSON formatted :class:`str`.  The arguments have the
+   same meaning as in :func:`dump`.

-.. function:: load(fp[, encoding[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, object_pairs_hook[, **kw]]]]]]]])
+.. function:: load(fp[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, object_pairs_hook[, **kw]]]]]]]])

    Deserialize *fp* (a ``.read()``-supporting file-like object containing a JSON
    document) to a Python object.

-   If the contents of *fp* are encoded with an ASCII based encoding other than
-   UTF-8 (e.g. latin-1), then an appropriate *encoding* name must be specified.
-   Encodings that are not ASCII based (such as UCS-2) are not allowed, and
-   should be wrapped with ``codecs.getreader(encoding)(fp)``, or simply decoded
-   to a :class:`unicode` object and passed to :func:`loads`.
-
    *object_hook* is an optional function that will be called with the result of
    any object literal decode (a :class:`dict`).  The return value of
    *object_hook* will be used instead of the :class:`dict`.  This feature can be used

@@ -241,7 +229,7 @@ Encoders and decoders
    +---------------+-------------------+
    | array         | list              |
    +---------------+-------------------+
-   | string        | unicode           |
+   | string        | str               |
    +---------------+-------------------+
    | number (int)  | int               |
    +---------------+-------------------+

@@ -257,13 +245,6 @@ Encoders and decoders
    It also understands ``NaN``, ``Infinity``, and ``-Infinity`` as their
    corresponding ``float`` values, which is outside the JSON spec.

-   *encoding* determines the encoding used to interpret any :class:`str` objects
-   decoded by this instance (UTF-8 by default).  It has no effect when decoding
-   :class:`unicode` objects.
-
-   Note that currently only encodings that are a superset of ASCII work, strings
-   of other encodings should be passed in as :class:`unicode`.
-
    *object_hook*, if specified, will be called with the result of every JSON
    object decoded and its return value will be used in place of the given
    :class:`dict`.  This can be used to provide custom deserializations (e.g. to

@@ -298,20 +279,20 @@ Encoders and decoders
    .. method:: decode(s)

-      Return the Python representation of *s* (a :class:`str` or
-      :class:`unicode` instance containing a JSON document)
+      Return the Python representation of *s* (a :class:`str` instance
+      containing a JSON document)

    .. method:: raw_decode(s)

-      Decode a JSON document from *s* (a :class:`str` or :class:`unicode`
-      beginning with a JSON document) and return a 2-tuple of the Python
-      representation and the index in *s* where the document ended.
+      Decode a JSON document from *s* (a :class:`str` beginning with a
+      JSON document) and return a 2-tuple of the Python representation
+      and the index in *s* where the document ended.

       This can be used to decode a JSON document from a string that may have
       extraneous data at the end.

-.. class:: JSONEncoder([skipkeys[, ensure_ascii[, check_circular[, allow_nan[, sort_keys[, indent[, separators[, encoding[, default]]]]]]]]])
+.. class:: JSONEncoder([skipkeys[, ensure_ascii[, check_circular[, allow_nan[, sort_keys[, indent[, separators[, default]]]]]]]])

    Extensible JSON encoder for Python data structures.

@@ -324,7 +305,7 @@ Encoders and decoders
    +-------------------+---------------+
    | list, tuple       | array         |
    +-------------------+---------------+
-   | str, unicode      | string        |
+   | str               | string        |
    +-------------------+---------------+
    | int, float        | number        |
    +-------------------+---------------+

@@ -344,9 +325,9 @@ Encoders and decoders
    attempt encoding of keys that are not str, int, float or None.  If
    *skipkeys* is ``True``, such items are simply skipped.

-   If *ensure_ascii* is ``True`` (the default), the output is guaranteed to be
-   :class:`str` objects with all incoming unicode characters escaped.  If
-   *ensure_ascii* is ``False``, the output will be a unicode object.
+   If *ensure_ascii* is ``True`` (the default), the output is guaranteed to
+   have all incoming non-ASCII characters escaped.  If *ensure_ascii* is
+   ``False``, these characters will be output as-is.

    If *check_circular* is ``True`` (the default), then lists, dicts, and custom
    encoded objects will be checked for circular references during encoding to

@@ -376,10 +357,6 @@ Encoders and decoders
    otherwise be serialized.  It should return a JSON encodable version of the
    object or raise a :exc:`TypeError`.

-   If *encoding* is not ``None``, then all input strings will be transformed
-   into unicode using that encoding prior to JSON-encoding.  The default is
-   UTF-8.
-
    .. method:: default(o)
......
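Not part of the commit, but a quick sketch of the Python 3 behavior the revised documentation above describes: `dumps()` always returns `str` (with non-ASCII escaped by default), and `dump()` writes `str`, so the file object must accept `str` input.

```python
import io
import json

# dumps always returns str; non-ASCII characters are escaped by default.
s = json.dumps({"alpha": "\u03b1"})
assert isinstance(s, str)
assert s == '{"alpha": "\\u03b1"}'

# With ensure_ascii=False the characters are emitted as-is (still str).
assert json.dumps("\u03b1", ensure_ascii=False) == '"\u03b1"'

# dump() writes str, so fp must support str input (e.g. io.StringIO).
buf = io.StringIO()
json.dump([1, 2.5, None, True], buf)
assert buf.getvalue() == '[1, 2.5, null, true]'

# load() reads it back from a str-producing file object.
buf.seek(0)
assert json.load(buf) == [1, 2.5, None, True]
```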
This diff is collapsed.
"""Iterator based sre token scanner """JSON token scanner
""" """
import re import re
import sre_parse try:
import sre_compile from _json import make_scanner as c_make_scanner
import sre_constants except ImportError:
c_make_scanner = None
from re import VERBOSE, MULTILINE, DOTALL __all__ = ['make_scanner']
from sre_constants import BRANCH, SUBPATTERN
__all__ = ['Scanner', 'pattern'] NUMBER_RE = re.compile(
r'(-?(?:0|[1-9]\d*))(\.\d+)?([eE][-+]?\d+)?',
(re.VERBOSE | re.MULTILINE | re.DOTALL))
FLAGS = (VERBOSE | MULTILINE | DOTALL) def py_make_scanner(context):
parse_object = context.parse_object
parse_array = context.parse_array
parse_string = context.parse_string
match_number = NUMBER_RE.match
strict = context.strict
parse_float = context.parse_float
parse_int = context.parse_int
parse_constant = context.parse_constant
object_hook = context.object_hook
class Scanner(object): def _scan_once(string, idx):
def __init__(self, lexicon, flags=FLAGS):
self.actions = [None]
# Combine phrases into a compound pattern
s = sre_parse.Pattern()
s.flags = flags
p = []
for idx, token in enumerate(lexicon):
phrase = token.pattern
try: try:
subpattern = sre_parse.SubPattern(s, nextchar = string[idx]
[(SUBPATTERN, (idx + 1, sre_parse.parse(phrase, flags)))]) except IndexError:
except sre_constants.error: raise StopIteration
raise
p.append(subpattern)
self.actions.append(token)
s.groups = len(p) + 1 # NOTE(guido): Added to make SRE validation work
p = sre_parse.SubPattern(s, [(BRANCH, (None, p))])
self.scanner = sre_compile.compile(p)
def iterscan(self, string, idx=0, context=None): if nextchar == '"':
"""Yield match, end_idx for each match return parse_string(string, idx + 1, strict)
elif nextchar == '{':
return parse_object((string, idx + 1), strict,
_scan_once, object_hook, object_pairs_hook)
elif nextchar == '[':
return parse_array((string, idx + 1), _scan_once)
elif nextchar == 'n' and string[idx:idx + 4] == 'null':
return None, idx + 4
elif nextchar == 't' and string[idx:idx + 4] == 'true':
return True, idx + 4
elif nextchar == 'f' and string[idx:idx + 5] == 'false':
return False, idx + 5
""" m = match_number(string, idx)
match = self.scanner.scanner(string, idx).match if m is not None:
actions = self.actions integer, frac, exp = m.groups()
lastend = idx if frac or exp:
end = len(string) res = parse_float(integer + (frac or '') + (exp or ''))
while True: else:
m = match() res = parse_int(integer)
if m is None: return res, m.end()
break elif nextchar == 'N' and string[idx:idx + 3] == 'NaN':
matchbegin, matchend = m.span() return parse_constant('NaN'), idx + 3
if lastend == matchend: elif nextchar == 'I' and string[idx:idx + 8] == 'Infinity':
break return parse_constant('Infinity'), idx + 8
action = actions[m.lastindex] elif nextchar == '-' and string[idx:idx + 9] == '-Infinity':
if action is not None: return parse_constant('-Infinity'), idx + 9
rval, next_pos = action(m, context) else:
if next_pos is not None and next_pos != matchend: raise StopIteration
# "fast forward" the scanner
matchend = next_pos
match = self.scanner.scanner(string, matchend).match
yield rval, matchend
lastend = matchend
return _scan_once
def pattern(pattern, flags=FLAGS): make_scanner = c_make_scanner or py_make_scanner
def decorator(fn):
fn.pattern = pattern
fn.regex = re.compile(pattern, flags)
return fn
return decorator
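A brief illustration (not from the commit itself) of the `(value, end_index)` convention used by the scanner: `JSONDecoder.raw_decode` exposes the same shape through the public API, which is what allows trailing data after a document.

```python
import json

# raw_decode returns (object, end_index), the same convention the
# scanner's _scan_once uses internally.
dec = json.JSONDecoder()
obj, end = dec.raw_decode('{"a": 1} trailing garbage')
assert obj == {"a": 1}
assert end == 8  # index just past the closing brace

# decode(), by contrast, requires the whole string to be one document.
assert dec.decode('[null, true]') == [None, True]
```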
@@ -32,3 +32,10 @@ class TestDecode(TestCase):
             object_pairs_hook = OrderedDict,
             object_hook = lambda x: None),
             OrderedDict(p))
+
+    def test_decoder_optimizations(self):
+        # Several optimizations were made that skip over calls to
+        # the whitespace regex, so this test is designed to try and
+        # exercise the uncommon cases. The array cases are already covered.
+        rval = json.loads('{ "key" : "value" , "k":"v" }')
+        self.assertEquals(rval, {"key":"value", "k":"v"})
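The test context above exercises *object_pairs_hook*, added in this upgrade. A small usage sketch of the difference between the two hooks:

```python
import json
from collections import OrderedDict

# object_pairs_hook receives the key/value pairs in document order,
# so insertion order can be preserved (or duplicate keys handled).
s = '{"b": 1, "a": 2}'
od = json.loads(s, object_pairs_hook=OrderedDict)
assert list(od.items()) == [("b", 1), ("a", 2)]

# object_hook, by contrast, sees the already-built dict.
assert json.loads(s, object_hook=lambda d: sorted(d)) == ["a", "b"]
```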
@@ -11,3 +11,11 @@ class TestDump(TestCase):
     def test_dumps(self):
         self.assertEquals(json.dumps({}), '{}')
+
+    def test_encode_truefalse(self):
+        self.assertEquals(json.dumps(
+                 {True: False, False: True}, sort_keys=True),
+                 '{"false": true, "true": false}')
+        self.assertEquals(json.dumps(
+                {2: 3.0, 4.0: 5, False: 1, 6: True}, sort_keys=True),
+                '{"false": 1, "2": 3.0, "4.0": 5, "6": true}')
@@ -3,22 +3,20 @@ from unittest import TestCase

 import json.encoder

 CASES = [
-    ('/\\"\ucafe\ubabe\uab98\ufcde\ubcda\uef4a\x08\x0c\n\r\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?', b'"/\\\\\\"\\ucafe\\ubabe\\uab98\\ufcde\\ubcda\\uef4a\\b\\f\\n\\r\\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?"'),
+    ('/\\"\ucafe\ubabe\uab98\ufcde\ubcda\uef4a\x08\x0c\n\r\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?', '"/\\\\\\"\\ucafe\\ubabe\\uab98\\ufcde\\ubcda\\uef4a\\b\\f\\n\\r\\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?"'),
-    ('\u0123\u4567\u89ab\ucdef\uabcd\uef4a', b'"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
+    ('\u0123\u4567\u89ab\ucdef\uabcd\uef4a', '"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
-    ('controls', b'"controls"'),
+    ('controls', '"controls"'),
-    ('\x08\x0c\n\r\t', b'"\\b\\f\\n\\r\\t"'),
+    ('\x08\x0c\n\r\t', '"\\b\\f\\n\\r\\t"'),
-    ('{"object with 1 member":["array with 1 element"]}', b'"{\\"object with 1 member\\":[\\"array with 1 element\\"]}"'),
+    ('{"object with 1 member":["array with 1 element"]}', '"{\\"object with 1 member\\":[\\"array with 1 element\\"]}"'),
-    (' s p a c e d ', b'" s p a c e d "'),
+    (' s p a c e d ', '" s p a c e d "'),
-    ('\U0001d120', b'"\\ud834\\udd20"'),
+    ('\U0001d120', '"\\ud834\\udd20"'),
-    ('\u03b1\u03a9', b'"\\u03b1\\u03a9"'),
-    (b'\xce\xb1\xce\xa9', b'"\\u03b1\\u03a9"'),
-    ('\u03b1\u03a9', b'"\\u03b1\\u03a9"'),
-    (b'\xce\xb1\xce\xa9', b'"\\u03b1\\u03a9"'),
-    ('\u03b1\u03a9', b'"\\u03b1\\u03a9"'),
-    ('\u03b1\u03a9', b'"\\u03b1\\u03a9"'),
-    ("`1~!@#$%^&*()_+-={':[,]}|;.</>?", b'"`1~!@#$%^&*()_+-={\':[,]}|;.</>?"'),
-    ('\x08\x0c\n\r\t', b'"\\b\\f\\n\\r\\t"'),
-    ('\u0123\u4567\u89ab\ucdef\uabcd\uef4a', b'"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
+    ('\u03b1\u03a9', '"\\u03b1\\u03a9"'),
+    ('\u03b1\u03a9', '"\\u03b1\\u03a9"'),
+    ('\u03b1\u03a9', '"\\u03b1\\u03a9"'),
+    ('\u03b1\u03a9', '"\\u03b1\\u03a9"'),
+    ("`1~!@#$%^&*()_+-={':[,]}|;.</>?", '"`1~!@#$%^&*()_+-={\':[,]}|;.</>?"'),
+    ('\x08\x0c\n\r\t', '"\\b\\f\\n\\r\\t"'),
+    ('\u0123\u4567\u89ab\ucdef\uabcd\uef4a', '"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
 ]

 class TestEncodeBaseStringAscii(TestCase):

@@ -26,12 +24,14 @@ class TestEncodeBaseStringAscii(TestCase):
         self._test_encode_basestring_ascii(json.encoder.py_encode_basestring_ascii)

     def test_c_encode_basestring_ascii(self):
-        if json.encoder.c_encode_basestring_ascii is not None:
-            self._test_encode_basestring_ascii(json.encoder.c_encode_basestring_ascii)
+        if not json.encoder.c_encode_basestring_ascii:
+            return
+        self._test_encode_basestring_ascii(json.encoder.c_encode_basestring_ascii)

     def _test_encode_basestring_ascii(self, encode_basestring_ascii):
         fname = encode_basestring_ascii.__name__
         for input_string, expect in CASES:
             result = encode_basestring_ascii(input_string)
-            result = result.encode("ascii")
-            self.assertEquals(result, expect)
+            self.assertEquals(result, expect,
+                '{0!r} != {1!r} for {2}({3!r})'.format(
+                    result, expect, fname, input_string))
@@ -73,4 +73,4 @@ class TestFail(TestCase):
         except ValueError:
             pass
         else:
-            self.fail("Expected failure for fail%d.json: %r" % (idx, doc))
+            self.fail("Expected failure for fail{0}.json: {1!r}".format(idx, doc))
@@ -5,5 +5,11 @@ import json

 class TestFloat(TestCase):
     def test_floats(self):
-        for num in [1617161771.7650001, math.pi, math.pi**100, math.pi**-100]:
+        for num in [1617161771.7650001, math.pi, math.pi**100, math.pi**-100, 3.1]:
             self.assertEquals(float(json.dumps(num)), num)
+            self.assertEquals(json.loads(json.dumps(num)), num)
+
+    def test_ints(self):
+        for num in [1, 1<<32, 1<<64]:
+            self.assertEquals(json.dumps(num), str(num))
+            self.assertEquals(int(json.dumps(num)), num)
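The round-trip property the new tests assert can be sketched outside the test suite (not part of the commit): floats survive a dump/load cycle, and ints of any magnitude serialize as their plain decimal form.

```python
import json

# Floats round-trip through their repr-style encoding.
assert json.loads(json.dumps(3.1)) == 3.1

# Ints serialize as plain decimal regardless of magnitude.
assert json.dumps(1 << 64) == str(1 << 64)
assert json.loads(json.dumps(1 << 64)) == 1 << 64
```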
@@ -15,96 +15,90 @@ class TestScanString(TestCase):
     def _test_scanstring(self, scanstring):
         self.assertEquals(
-            scanstring('"z\\ud834\\udd20x"', 1, None, True),
+            scanstring('"z\\ud834\\udd20x"', 1, True),
             ('z\U0001d120x', 16))

         if sys.maxunicode == 65535:
             self.assertEquals(
-                scanstring('"z\U0001d120x"', 1, None, True),
+                scanstring('"z\U0001d120x"', 1, True),
                 ('z\U0001d120x', 6))
         else:
             self.assertEquals(
-                scanstring('"z\U0001d120x"', 1, None, True),
+                scanstring('"z\U0001d120x"', 1, True),
                 ('z\U0001d120x', 5))

         self.assertEquals(
-            scanstring('"\\u007b"', 1, None, True),
+            scanstring('"\\u007b"', 1, True),
             ('{', 8))
         self.assertEquals(
-            scanstring('"A JSON payload should be an object or array, not a string."', 1, None, True),
+            scanstring('"A JSON payload should be an object or array, not a string."', 1, True),
             ('A JSON payload should be an object or array, not a string.', 60))
         self.assertEquals(
-            scanstring('["Unclosed array"', 2, None, True),
+            scanstring('["Unclosed array"', 2, True),
             ('Unclosed array', 17))
         self.assertEquals(
-            scanstring('["extra comma",]', 2, None, True),
+            scanstring('["extra comma",]', 2, True),
             ('extra comma', 14))
         self.assertEquals(
-            scanstring('["double extra comma",,]', 2, None, True),
+            scanstring('["double extra comma",,]', 2, True),
             ('double extra comma', 21))
         self.assertEquals(
-            scanstring('["Comma after the close"],', 2, None, True),
+            scanstring('["Comma after the close"],', 2, True),
             ('Comma after the close', 24))
         self.assertEquals(
-            scanstring('["Extra close"]]', 2, None, True),
+            scanstring('["Extra close"]]', 2, True),
             ('Extra close', 14))
         self.assertEquals(
-            scanstring('{"Extra comma": true,}', 2, None, True),
+            scanstring('{"Extra comma": true,}', 2, True),
             ('Extra comma', 14))
         self.assertEquals(
-            scanstring('{"Extra value after close": true} "misplaced quoted value"', 2, None, True),
+            scanstring('{"Extra value after close": true} "misplaced quoted value"', 2, True),
             ('Extra value after close', 26))
         self.assertEquals(
-            scanstring('{"Illegal expression": 1 + 2}', 2, None, True),
+            scanstring('{"Illegal expression": 1 + 2}', 2, True),
             ('Illegal expression', 21))
         self.assertEquals(
-            scanstring('{"Illegal invocation": alert()}', 2, None, True),
+            scanstring('{"Illegal invocation": alert()}', 2, True),
             ('Illegal invocation', 21))
         self.assertEquals(
-            scanstring('{"Numbers cannot have leading zeroes": 013}', 2, None, True),
+            scanstring('{"Numbers cannot have leading zeroes": 013}', 2, True),
             ('Numbers cannot have leading zeroes', 37))
         self.assertEquals(
-            scanstring('{"Numbers cannot be hex": 0x14}', 2, None, True),
+            scanstring('{"Numbers cannot be hex": 0x14}', 2, True),
             ('Numbers cannot be hex', 24))
         self.assertEquals(
-            scanstring('[[[[[[[[[[[[[[[[[[[["Too deep"]]]]]]]]]]]]]]]]]]]]', 21, None, True),
+            scanstring('[[[[[[[[[[[[[[[[[[[["Too deep"]]]]]]]]]]]]]]]]]]]]', 21, True),
             ('Too deep', 30))
         self.assertEquals(
-            scanstring('{"Missing colon" null}', 2, None, True),
+            scanstring('{"Missing colon" null}', 2, True),
             ('Missing colon', 16))
         self.assertEquals(
-            scanstring('{"Double colon":: null}', 2, None, True),
+            scanstring('{"Double colon":: null}', 2, True),
             ('Double colon', 15))
         self.assertEquals(
-            scanstring('{"Comma instead of colon", null}', 2, None, True),
+            scanstring('{"Comma instead of colon", null}', 2, True),
             ('Comma instead of colon', 25))
         self.assertEquals(
-            scanstring('["Colon instead of comma": false]', 2, None, True),
+            scanstring('["Colon instead of comma": false]', 2, True),
             ('Colon instead of comma', 25))
         self.assertEquals(
-            scanstring('["Bad value", truth]', 2, None, True),
+            scanstring('["Bad value", truth]', 2, True),
             ('Bad value', 12))

-    def test_issue3623(self):
-        self.assertRaises(ValueError, json.decoder.scanstring, b"xxx", 1,
-                          "xxx")
-        self.assertRaises(UnicodeDecodeError,
-                          json.encoder.encode_basestring_ascii, b"xx\xff")
@@ -4,20 +4,8 @@ import json

 from collections import OrderedDict

 class TestUnicode(TestCase):
-    def test_encoding1(self):
-        encoder = json.JSONEncoder(encoding='utf-8')
-        u = '\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
-        s = u.encode('utf-8')
-        ju = encoder.encode(u)
-        js = encoder.encode(s)
-        self.assertEquals(ju, js)
-
-    def test_encoding2(self):
-        u = '\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
-        s = u.encode('utf-8')
-        ju = json.dumps(u, encoding='utf-8')
-        js = json.dumps(s, encoding='utf-8')
-        self.assertEquals(ju, js)
+    # test_encoding1 and test_encoding2 from 2.x are irrelevant (only str
+    # is supported as input, not bytes).

     def test_encoding3(self):
         u = '\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'

@@ -52,8 +40,22 @@ class TestUnicode(TestCase):
     def test_unicode_decode(self):
         for i in range(0, 0xd7ff):
             u = chr(i)
-            js = '"\\u{0:04x}"'.format(i)
-            self.assertEquals(json.loads(js), u)
+            s = '"\\u{0:04x}"'.format(i)
+            self.assertEquals(json.loads(s), u)
+
+    def test_unicode_preservation(self):
+        self.assertEquals(type(json.loads('""')), str)
+        self.assertEquals(type(json.loads('"a"')), str)
+        self.assertEquals(type(json.loads('["a"]')[0]), str)
+
+    def test_bytes_encode(self):
+        self.assertRaises(TypeError, json.dumps, b"hi")
+        self.assertRaises(TypeError, json.dumps, [b"hi"])
+
+    def test_bytes_decode(self):
+        self.assertRaises(TypeError, json.loads, b'"hi"')
+        self.assertRaises(TypeError, json.loads, b'["hi"]')

     def test_object_pairs_hook_with_unicode(self):
         s = '{"xkd":1, "kcw":2, "art":3, "hxm":4, "qrt":5, "pad":6, "hoy":7}'
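The unicode decode tests above rely on `\uXXXX` escape handling; a short sketch (not from the commit) of what they assert, including surrogate-pair joining for non-BMP characters:

```python
import json

# JSON \uXXXX escapes decode to str.
assert json.loads('"\\u03b1"') == '\u03b1'

# A surrogate pair decodes to a single non-BMP character.
assert json.loads('"\\ud834\\udd20"') == '\U0001d120'

# Decoded strings are always str, never bytes.
assert type(json.loads('"a"')) is str
```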
......
@@ -2,11 +2,11 @@ r"""Command-line tool to validate and pretty-print JSON

 Usage::

-    $ echo '{"json":"obj"}' | python -mjson.tool
+    $ echo '{"json":"obj"}' | python -m json.tool
     {
         "json": "obj"
     }
-    $ echo '{ 1.2:3.4}' | python -mjson.tool
+    $ echo '{ 1.2:3.4}' | python -m json.tool
     Expecting property name: line 1 column 2 (char 2)

 """

@@ -24,7 +24,7 @@ def main():
         infile = open(sys.argv[1], 'rb')
         outfile = open(sys.argv[2], 'wb')
     else:
-        raise SystemExit("{0} [infile [outfile]]".format(sys.argv[0]))
+        raise SystemExit(sys.argv[0] + " [infile [outfile]]")
     try:
         obj = json.load(infile)
     except ValueError as e:
......
@@ -107,6 +107,8 @@ Installation

 Library
 -------

+- The json module now works exclusively with str and not bytes.
+
 - Issue #3959: The ipaddr module has been added to the standard library.
   Contributed by Google.
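The NEWS entry above can be demonstrated directly; note this sketch sticks to `dumps()`, since later Python versions relaxed `loads()` to accept bytes again.

```python
import json

# str in, str out.
assert json.dumps(["ok"]) == '["ok"]'

# bytes are rejected on the encoding path.
try:
    json.dumps(b"bytes are not serializable")
except TypeError:
    pass
else:
    raise AssertionError("expected TypeError for bytes input")
```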
......
This diff is collapsed.