Commit bbc85f30 authored by Benjamin Peterson

port simplejson upgrade from the trunk #4136

json also now works only with unicode strings

Patch by Antoine Pitrou; updated by me
parent 9d5e41f9
@@ -112,7 +112,7 @@ Using json.tool from the shell to validate and pretty-print::

 Basic Usage
 -----------

-.. function:: dump(obj, fp[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, encoding[, default[, **kw]]]]]]]]]])
+.. function:: dump(obj, fp[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, default[, **kw]]]]]]]]]])

    Serialize *obj* as a JSON formatted stream to *fp* (a ``.write()``-supporting
    file-like object).

@@ -122,11 +122,10 @@ Basic Usage
    :class:`float`, :class:`bool`, ``None``) will be skipped instead of raising a
    :exc:`TypeError`.

-   If *ensure_ascii* is ``False`` (default: ``True``), then some chunks written
-   to *fp* may be :class:`unicode` instances, subject to normal Python
-   :class:`str` to :class:`unicode` coercion rules.  Unless ``fp.write()``
-   explicitly understands :class:`unicode` (as in :func:`codecs.getwriter`) this
-   is likely to cause an error.
+   The :mod:`json` module always produces :class:`str` objects, not
+   :class:`bytes` objects.  Therefore, ``fp.write()`` must support :class:`str`
+   input.

    If *check_circular* is ``False`` (default: ``True``), then the circular
    reference check for container types will be skipped and a circular reference

@@ -146,8 +145,6 @@ Basic Usage
    will be used instead of the default ``(', ', ': ')`` separators.  ``(',',
    ':')`` is the most compact JSON representation.

-   *encoding* is the character encoding for str instances, default is UTF-8.
-
    *default(obj)* is a function that should return a serializable version of
    *obj* or raise :exc:`TypeError`.  The default simply raises :exc:`TypeError`.

@@ -156,26 +153,17 @@ Basic Usage
    *cls* kwarg.

-.. function:: dumps(obj[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, encoding[, default[, **kw]]]]]]]]]])
+.. function:: dumps(obj[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, default[, **kw]]]]]]]]]])

-   Serialize *obj* to a JSON formatted :class:`str`.
-
-   If *ensure_ascii* is ``False``, then the return value will be a
-   :class:`unicode` instance.  The other arguments have the same meaning as in
-   :func:`dump`.
+   Serialize *obj* to a JSON formatted :class:`str`.  The arguments have the
+   same meaning as in :func:`dump`.

-.. function:: load(fp[, encoding[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, object_pairs_hook[, **kw]]]]]]]])
+.. function:: load(fp[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, object_pairs_hook[, **kw]]]]]]]])

    Deserialize *fp* (a ``.read()``-supporting file-like object containing a JSON
    document) to a Python object.

-   If the contents of *fp* are encoded with an ASCII based encoding other than
-   UTF-8 (e.g. latin-1), then an appropriate *encoding* name must be specified.
-   Encodings that are not ASCII based (such as UCS-2) are not allowed, and
-   should be wrapped with ``codecs.getreader(encoding)(fp)``, or simply decoded
-   to a :class:`unicode` object and passed to :func:`loads`.
-
    *object_hook* is an optional function that will be called with the result of
    any object literal decode (a :class:`dict`).  The return value of
    *object_hook* will be used instead of the :class:`dict`.  This feature can be used

@@ -241,7 +229,7 @@ Encoders and decoders
    +---------------+-------------------+
    | array         | list              |
    +---------------+-------------------+
-   | string        | unicode           |
+   | string        | str               |
    +---------------+-------------------+
    | number (int)  | int               |
    +---------------+-------------------+

@@ -257,13 +245,6 @@ Encoders and decoders
    It also understands ``NaN``, ``Infinity``, and ``-Infinity`` as their
    corresponding ``float`` values, which is outside the JSON spec.

-   *encoding* determines the encoding used to interpret any :class:`str` objects
-   decoded by this instance (UTF-8 by default).  It has no effect when decoding
-   :class:`unicode` objects.
-
-   Note that currently only encodings that are a superset of ASCII work, strings
-   of other encodings should be passed in as :class:`unicode`.
-
    *object_hook*, if specified, will be called with the result of every JSON
    object decoded and its return value will be used in place of the given
    :class:`dict`.  This can be used to provide custom deserializations (e.g. to

@@ -298,20 +279,20 @@ Encoders and decoders
    .. method:: decode(s)

-      Return the Python representation of *s* (a :class:`str` or
-      :class:`unicode` instance containing a JSON document)
+      Return the Python representation of *s* (a :class:`str` instance
+      containing a JSON document)

    .. method:: raw_decode(s)

-      Decode a JSON document from *s* (a :class:`str` or :class:`unicode`
-      beginning with a JSON document) and return a 2-tuple of the Python
-      representation and the index in *s* where the document ended.
+      Decode a JSON document from *s* (a :class:`str` beginning with a
+      JSON document) and return a 2-tuple of the Python representation
+      and the index in *s* where the document ended.

       This can be used to decode a JSON document from a string that may have
       extraneous data at the end.

-.. class:: JSONEncoder([skipkeys[, ensure_ascii[, check_circular[, allow_nan[, sort_keys[, indent[, separators[, encoding[, default]]]]]]]]])
+.. class:: JSONEncoder([skipkeys[, ensure_ascii[, check_circular[, allow_nan[, sort_keys[, indent[, separators[, default]]]]]]]])

    Extensible JSON encoder for Python data structures.

@@ -324,7 +305,7 @@ Encoders and decoders
    +-------------------+---------------+
    | list, tuple       | array         |
    +-------------------+---------------+
-   | str, unicode      | string        |
+   | str               | string        |
    +-------------------+---------------+
    | int, float        | number        |
    +-------------------+---------------+

@@ -344,9 +325,9 @@ Encoders and decoders
    attempt encoding of keys that are not str, int, float or None.  If
    *skipkeys* is ``True``, such items are simply skipped.

-   If *ensure_ascii* is ``True`` (the default), the output is guaranteed to be
-   :class:`str` objects with all incoming unicode characters escaped.  If
-   *ensure_ascii* is ``False``, the output will be a unicode object.
+   If *ensure_ascii* is ``True`` (the default), the output is guaranteed to
+   have all incoming non-ASCII characters escaped.  If *ensure_ascii* is
+   ``False``, these characters will be output as-is.

    If *check_circular* is ``True`` (the default), then lists, dicts, and custom
    encoded objects will be checked for circular references during encoding to

@@ -376,10 +357,6 @@ Encoders and decoders
    otherwise be serialized.  It should return a JSON encodable version of the
    object or raise a :exc:`TypeError`.

-   If *encoding* is not ``None``, then all input strings will be transformed
-   into unicode using that encoding prior to JSON-encoding.  The default is
-   UTF-8.
-
    .. method:: default(o)
......
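Not part of the commit, but a quick sketch of the Python 3 behavior the revised documentation above describes: `dumps()` always returns `str` (with non-ASCII escaped by default), and `dump()` writes `str`, so the file object must accept `str` input.

```python
import io
import json

# dumps always returns str; non-ASCII characters are escaped by default.
s = json.dumps({"alpha": "\u03b1"})
assert isinstance(s, str)
assert s == '{"alpha": "\\u03b1"}'

# With ensure_ascii=False the characters are emitted as-is (still str).
assert json.dumps("\u03b1", ensure_ascii=False) == '"\u03b1"'

# dump() writes str, so fp must support str input (e.g. io.StringIO).
buf = io.StringIO()
json.dump([1, 2.5, None, True], buf)
assert buf.getvalue() == '[1, 2.5, null, true]'

# load() reads it back from a str-producing file object.
buf.seek(0)
assert json.load(buf) == [1, 2.5, None, True]
```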
This diff is collapsed.
"""Iterator based sre token scanner """JSON token scanner
""" """
import re import re
import sre_parse try:
import sre_compile from _json import make_scanner as c_make_scanner
import sre_constants except ImportError:
c_make_scanner = None
from re import VERBOSE, MULTILINE, DOTALL __all__ = ['make_scanner']
from sre_constants import BRANCH, SUBPATTERN
__all__ = ['Scanner', 'pattern'] NUMBER_RE = re.compile(
r'(-?(?:0|[1-9]\d*))(\.\d+)?([eE][-+]?\d+)?',
(re.VERBOSE | re.MULTILINE | re.DOTALL))
FLAGS = (VERBOSE | MULTILINE | DOTALL) def py_make_scanner(context):
parse_object = context.parse_object
parse_array = context.parse_array
parse_string = context.parse_string
match_number = NUMBER_RE.match
strict = context.strict
parse_float = context.parse_float
parse_int = context.parse_int
parse_constant = context.parse_constant
object_hook = context.object_hook
class Scanner(object): def _scan_once(string, idx):
def __init__(self, lexicon, flags=FLAGS):
self.actions = [None]
# Combine phrases into a compound pattern
s = sre_parse.Pattern()
s.flags = flags
p = []
for idx, token in enumerate(lexicon):
phrase = token.pattern
try: try:
subpattern = sre_parse.SubPattern(s, nextchar = string[idx]
[(SUBPATTERN, (idx + 1, sre_parse.parse(phrase, flags)))]) except IndexError:
except sre_constants.error: raise StopIteration
raise
p.append(subpattern)
self.actions.append(token)
s.groups = len(p) + 1 # NOTE(guido): Added to make SRE validation work
p = sre_parse.SubPattern(s, [(BRANCH, (None, p))])
self.scanner = sre_compile.compile(p)
def iterscan(self, string, idx=0, context=None): if nextchar == '"':
"""Yield match, end_idx for each match return parse_string(string, idx + 1, strict)
elif nextchar == '{':
return parse_object((string, idx + 1), strict,
_scan_once, object_hook, object_pairs_hook)
elif nextchar == '[':
return parse_array((string, idx + 1), _scan_once)
elif nextchar == 'n' and string[idx:idx + 4] == 'null':
return None, idx + 4
elif nextchar == 't' and string[idx:idx + 4] == 'true':
return True, idx + 4
elif nextchar == 'f' and string[idx:idx + 5] == 'false':
return False, idx + 5
""" m = match_number(string, idx)
match = self.scanner.scanner(string, idx).match if m is not None:
actions = self.actions integer, frac, exp = m.groups()
lastend = idx if frac or exp:
end = len(string) res = parse_float(integer + (frac or '') + (exp or ''))
while True: else:
m = match() res = parse_int(integer)
if m is None: return res, m.end()
break elif nextchar == 'N' and string[idx:idx + 3] == 'NaN':
matchbegin, matchend = m.span() return parse_constant('NaN'), idx + 3
if lastend == matchend: elif nextchar == 'I' and string[idx:idx + 8] == 'Infinity':
break return parse_constant('Infinity'), idx + 8
action = actions[m.lastindex] elif nextchar == '-' and string[idx:idx + 9] == '-Infinity':
if action is not None: return parse_constant('-Infinity'), idx + 9
rval, next_pos = action(m, context) else:
if next_pos is not None and next_pos != matchend: raise StopIteration
# "fast forward" the scanner
matchend = next_pos
match = self.scanner.scanner(string, matchend).match
yield rval, matchend
lastend = matchend
return _scan_once
def pattern(pattern, flags=FLAGS): make_scanner = c_make_scanner or py_make_scanner
def decorator(fn):
fn.pattern = pattern
fn.regex = re.compile(pattern, flags)
return fn
return decorator
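A brief illustration (not from the commit itself) of the `(value, end_index)` convention used by the scanner: `JSONDecoder.raw_decode` exposes the same shape through the public API, which is what allows trailing data after a document.

```python
import json

# raw_decode returns (object, end_index), the same convention the
# scanner's _scan_once uses internally.
dec = json.JSONDecoder()
obj, end = dec.raw_decode('{"a": 1} trailing garbage')
assert obj == {"a": 1}
assert end == 8  # index just past the closing brace

# decode(), by contrast, requires the whole string to be one document.
assert dec.decode('[null, true]') == [None, True]
```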
@@ -32,3 +32,10 @@ class TestDecode(TestCase):
             object_pairs_hook = OrderedDict,
             object_hook = lambda x: None),
             OrderedDict(p))
+
+    def test_decoder_optimizations(self):
+        # Several optimizations were made that skip over calls to
+        # the whitespace regex, so this test is designed to try and
+        # exercise the uncommon cases. The array cases are already covered.
+        rval = json.loads('{ "key" : "value" , "k":"v" }')
+        self.assertEquals(rval, {"key":"value", "k":"v"})
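The test context above exercises *object_pairs_hook*, added in this upgrade. A small usage sketch of the difference between the two hooks:

```python
import json
from collections import OrderedDict

# object_pairs_hook receives the key/value pairs in document order,
# so insertion order can be preserved (or duplicate keys handled).
s = '{"b": 1, "a": 2}'
od = json.loads(s, object_pairs_hook=OrderedDict)
assert list(od.items()) == [("b", 1), ("a", 2)]

# object_hook, by contrast, sees the already-built dict.
assert json.loads(s, object_hook=lambda d: sorted(d)) == ["a", "b"]
```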
@@ -11,3 +11,11 @@ class TestDump(TestCase):
     def test_dumps(self):
         self.assertEquals(json.dumps({}), '{}')
+
+    def test_encode_truefalse(self):
+        self.assertEquals(json.dumps(
+                 {True: False, False: True}, sort_keys=True),
+                 '{"false": true, "true": false}')
+        self.assertEquals(json.dumps(
+                {2: 3.0, 4.0: 5, False: 1, 6: True}, sort_keys=True),
+                '{"false": 1, "2": 3.0, "4.0": 5, "6": true}')
@@ -3,22 +3,20 @@ from unittest import TestCase

 import json.encoder

 CASES = [
-    ('/\\"\ucafe\ubabe\uab98\ufcde\ubcda\uef4a\x08\x0c\n\r\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?', b'"/\\\\\\"\\ucafe\\ubabe\\uab98\\ufcde\\ubcda\\uef4a\\b\\f\\n\\r\\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?"'),
+    ('/\\"\ucafe\ubabe\uab98\ufcde\ubcda\uef4a\x08\x0c\n\r\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?', '"/\\\\\\"\\ucafe\\ubabe\\uab98\\ufcde\\ubcda\\uef4a\\b\\f\\n\\r\\t`1~!@#$%^&*()_+-=[]{}|;:\',./<>?"'),
-    ('\u0123\u4567\u89ab\ucdef\uabcd\uef4a', b'"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
+    ('\u0123\u4567\u89ab\ucdef\uabcd\uef4a', '"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
-    ('controls', b'"controls"'),
+    ('controls', '"controls"'),
-    ('\x08\x0c\n\r\t', b'"\\b\\f\\n\\r\\t"'),
+    ('\x08\x0c\n\r\t', '"\\b\\f\\n\\r\\t"'),
-    ('{"object with 1 member":["array with 1 element"]}', b'"{\\"object with 1 member\\":[\\"array with 1 element\\"]}"'),
+    ('{"object with 1 member":["array with 1 element"]}', '"{\\"object with 1 member\\":[\\"array with 1 element\\"]}"'),
-    (' s p a c e d ', b'" s p a c e d "'),
+    (' s p a c e d ', '" s p a c e d "'),
-    ('\U0001d120', b'"\\ud834\\udd20"'),
+    ('\U0001d120', '"\\ud834\\udd20"'),
-    ('\u03b1\u03a9', b'"\\u03b1\\u03a9"'),
-    (b'\xce\xb1\xce\xa9', b'"\\u03b1\\u03a9"'),
-    ('\u03b1\u03a9', b'"\\u03b1\\u03a9"'),
-    (b'\xce\xb1\xce\xa9', b'"\\u03b1\\u03a9"'),
-    ('\u03b1\u03a9', b'"\\u03b1\\u03a9"'),
-    ('\u03b1\u03a9', b'"\\u03b1\\u03a9"'),
-    ("`1~!@#$%^&*()_+-={':[,]}|;.</>?", b'"`1~!@#$%^&*()_+-={\':[,]}|;.</>?"'),
-    ('\x08\x0c\n\r\t', b'"\\b\\f\\n\\r\\t"'),
-    ('\u0123\u4567\u89ab\ucdef\uabcd\uef4a', b'"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
+    ('\u03b1\u03a9', '"\\u03b1\\u03a9"'),
+    ('\u03b1\u03a9', '"\\u03b1\\u03a9"'),
+    ('\u03b1\u03a9', '"\\u03b1\\u03a9"'),
+    ('\u03b1\u03a9', '"\\u03b1\\u03a9"'),
+    ("`1~!@#$%^&*()_+-={':[,]}|;.</>?", '"`1~!@#$%^&*()_+-={\':[,]}|;.</>?"'),
+    ('\x08\x0c\n\r\t', '"\\b\\f\\n\\r\\t"'),
+    ('\u0123\u4567\u89ab\ucdef\uabcd\uef4a', '"\\u0123\\u4567\\u89ab\\ucdef\\uabcd\\uef4a"'),
 ]

 class TestEncodeBaseStringAscii(TestCase):

@@ -26,12 +24,14 @@ class TestEncodeBaseStringAscii(TestCase):
         self._test_encode_basestring_ascii(json.encoder.py_encode_basestring_ascii)

     def test_c_encode_basestring_ascii(self):
-        if json.encoder.c_encode_basestring_ascii is not None:
-            self._test_encode_basestring_ascii(json.encoder.c_encode_basestring_ascii)
+        if not json.encoder.c_encode_basestring_ascii:
+            return
+        self._test_encode_basestring_ascii(json.encoder.c_encode_basestring_ascii)

     def _test_encode_basestring_ascii(self, encode_basestring_ascii):
         fname = encode_basestring_ascii.__name__
         for input_string, expect in CASES:
             result = encode_basestring_ascii(input_string)
-            result = result.encode("ascii")
-            self.assertEquals(result, expect)
+            self.assertEquals(result, expect,
+                '{0!r} != {1!r} for {2}({3!r})'.format(
+                    result, expect, fname, input_string))
@@ -73,4 +73,4 @@ class TestFail(TestCase):
         except ValueError:
             pass
         else:
-            self.fail("Expected failure for fail%d.json: %r" % (idx, doc))
+            self.fail("Expected failure for fail{0}.json: {1!r}".format(idx, doc))
@@ -5,5 +5,11 @@ import json

 class TestFloat(TestCase):
     def test_floats(self):
-        for num in [1617161771.7650001, math.pi, math.pi**100, math.pi**-100]:
+        for num in [1617161771.7650001, math.pi, math.pi**100, math.pi**-100, 3.1]:
             self.assertEquals(float(json.dumps(num)), num)
+            self.assertEquals(json.loads(json.dumps(num)), num)
+
+    def test_ints(self):
+        for num in [1, 1<<32, 1<<64]:
+            self.assertEquals(json.dumps(num), str(num))
+            self.assertEquals(int(json.dumps(num)), num)
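The round-trip property the new tests assert can be sketched outside the test suite (not part of the commit): floats survive a dump/load cycle, and ints of any magnitude serialize as their plain decimal form.

```python
import json

# Floats round-trip through their repr-style encoding.
assert json.loads(json.dumps(3.1)) == 3.1

# Ints serialize as plain decimal regardless of magnitude.
assert json.dumps(1 << 64) == str(1 << 64)
assert json.loads(json.dumps(1 << 64)) == 1 << 64
```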
@@ -15,96 +15,90 @@ class TestScanString(TestCase):
     def _test_scanstring(self, scanstring):
         self.assertEquals(
-            scanstring('"z\\ud834\\udd20x"', 1, None, True),
+            scanstring('"z\\ud834\\udd20x"', 1, True),
             ('z\U0001d120x', 16))

         if sys.maxunicode == 65535:
             self.assertEquals(
-                scanstring('"z\U0001d120x"', 1, None, True),
+                scanstring('"z\U0001d120x"', 1, True),
                 ('z\U0001d120x', 6))
         else:
             self.assertEquals(
-                scanstring('"z\U0001d120x"', 1, None, True),
+                scanstring('"z\U0001d120x"', 1, True),
                 ('z\U0001d120x', 5))

         self.assertEquals(
-            scanstring('"\\u007b"', 1, None, True),
+            scanstring('"\\u007b"', 1, True),
             ('{', 8))
         self.assertEquals(
-            scanstring('"A JSON payload should be an object or array, not a string."', 1, None, True),
+            scanstring('"A JSON payload should be an object or array, not a string."', 1, True),
             ('A JSON payload should be an object or array, not a string.', 60))
         self.assertEquals(
-            scanstring('["Unclosed array"', 2, None, True),
+            scanstring('["Unclosed array"', 2, True),
             ('Unclosed array', 17))
         self.assertEquals(
-            scanstring('["extra comma",]', 2, None, True),
+            scanstring('["extra comma",]', 2, True),
             ('extra comma', 14))
         self.assertEquals(
-            scanstring('["double extra comma",,]', 2, None, True),
+            scanstring('["double extra comma",,]', 2, True),
             ('double extra comma', 21))
         self.assertEquals(
-            scanstring('["Comma after the close"],', 2, None, True),
+            scanstring('["Comma after the close"],', 2, True),
             ('Comma after the close', 24))
         self.assertEquals(
-            scanstring('["Extra close"]]', 2, None, True),
+            scanstring('["Extra close"]]', 2, True),
             ('Extra close', 14))
         self.assertEquals(
-            scanstring('{"Extra comma": true,}', 2, None, True),
+            scanstring('{"Extra comma": true,}', 2, True),
             ('Extra comma', 14))
         self.assertEquals(
-            scanstring('{"Extra value after close": true} "misplaced quoted value"', 2, None, True),
+            scanstring('{"Extra value after close": true} "misplaced quoted value"', 2, True),
             ('Extra value after close', 26))
         self.assertEquals(
-            scanstring('{"Illegal expression": 1 + 2}', 2, None, True),
+            scanstring('{"Illegal expression": 1 + 2}', 2, True),
             ('Illegal expression', 21))
         self.assertEquals(
-            scanstring('{"Illegal invocation": alert()}', 2, None, True),
+            scanstring('{"Illegal invocation": alert()}', 2, True),
             ('Illegal invocation', 21))
         self.assertEquals(
-            scanstring('{"Numbers cannot have leading zeroes": 013}', 2, None, True),
+            scanstring('{"Numbers cannot have leading zeroes": 013}', 2, True),
             ('Numbers cannot have leading zeroes', 37))
         self.assertEquals(
-            scanstring('{"Numbers cannot be hex": 0x14}', 2, None, True),
+            scanstring('{"Numbers cannot be hex": 0x14}', 2, True),
             ('Numbers cannot be hex', 24))
         self.assertEquals(
-            scanstring('[[[[[[[[[[[[[[[[[[[["Too deep"]]]]]]]]]]]]]]]]]]]]', 21, None, True),
+            scanstring('[[[[[[[[[[[[[[[[[[[["Too deep"]]]]]]]]]]]]]]]]]]]]', 21, True),
             ('Too deep', 30))
         self.assertEquals(
-            scanstring('{"Missing colon" null}', 2, None, True),
+            scanstring('{"Missing colon" null}', 2, True),
             ('Missing colon', 16))
         self.assertEquals(
-            scanstring('{"Double colon":: null}', 2, None, True),
+            scanstring('{"Double colon":: null}', 2, True),
             ('Double colon', 15))
         self.assertEquals(
-            scanstring('{"Comma instead of colon", null}', 2, None, True),
+            scanstring('{"Comma instead of colon", null}', 2, True),
             ('Comma instead of colon', 25))
         self.assertEquals(
-            scanstring('["Colon instead of comma": false]', 2, None, True),
+            scanstring('["Colon instead of comma": false]', 2, True),
             ('Colon instead of comma', 25))
         self.assertEquals(
-            scanstring('["Bad value", truth]', 2, None, True),
+            scanstring('["Bad value", truth]', 2, True),
             ('Bad value', 12))

-    def test_issue3623(self):
-        self.assertRaises(ValueError, json.decoder.scanstring, b"xxx", 1,
-                          "xxx")
-        self.assertRaises(UnicodeDecodeError,
-                          json.encoder.encode_basestring_ascii, b"xx\xff")
@@ -4,20 +4,8 @@ import json

 from collections import OrderedDict

 class TestUnicode(TestCase):
-    def test_encoding1(self):
-        encoder = json.JSONEncoder(encoding='utf-8')
-        u = '\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
-        s = u.encode('utf-8')
-        ju = encoder.encode(u)
-        js = encoder.encode(s)
-        self.assertEquals(ju, js)
-
-    def test_encoding2(self):
-        u = '\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'
-        s = u.encode('utf-8')
-        ju = json.dumps(u, encoding='utf-8')
-        js = json.dumps(s, encoding='utf-8')
-        self.assertEquals(ju, js)
+    # test_encoding1 and test_encoding2 from 2.x are irrelevant (only str
+    # is supported as input, not bytes).

     def test_encoding3(self):
         u = '\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}'

@@ -52,8 +40,22 @@ class TestUnicode(TestCase):
     def test_unicode_decode(self):
         for i in range(0, 0xd7ff):
             u = chr(i)
-            js = '"\\u{0:04x}"'.format(i)
-            self.assertEquals(json.loads(js), u)
+            s = '"\\u{0:04x}"'.format(i)
+            self.assertEquals(json.loads(s), u)
+
+    def test_unicode_preservation(self):
+        self.assertEquals(type(json.loads('""')), str)
+        self.assertEquals(type(json.loads('"a"')), str)
+        self.assertEquals(type(json.loads('["a"]')[0]), str)
+
+    def test_bytes_encode(self):
+        self.assertRaises(TypeError, json.dumps, b"hi")
+        self.assertRaises(TypeError, json.dumps, [b"hi"])
+
+    def test_bytes_decode(self):
+        self.assertRaises(TypeError, json.loads, b'"hi"')
+        self.assertRaises(TypeError, json.loads, b'["hi"]')

     def test_object_pairs_hook_with_unicode(self):
         s = '{"xkd":1, "kcw":2, "art":3, "hxm":4, "qrt":5, "pad":6, "hoy":7}'
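The unicode decode tests above rely on `\uXXXX` escape handling; a short sketch (not from the commit) of what they assert, including surrogate-pair joining for non-BMP characters:

```python
import json

# JSON \uXXXX escapes decode to str.
assert json.loads('"\\u03b1"') == '\u03b1'

# A surrogate pair decodes to a single non-BMP character.
assert json.loads('"\\ud834\\udd20"') == '\U0001d120'

# Decoded strings are always str, never bytes.
assert type(json.loads('"a"')) is str
```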
......
@@ -2,11 +2,11 @@ r"""Command-line tool to validate and pretty-print JSON

 Usage::

-    $ echo '{"json":"obj"}' | python -mjson.tool
+    $ echo '{"json":"obj"}' | python -m json.tool
     {
         "json": "obj"
     }
-    $ echo '{ 1.2:3.4}' | python -mjson.tool
+    $ echo '{ 1.2:3.4}' | python -m json.tool
     Expecting property name: line 1 column 2 (char 2)

 """

@@ -24,7 +24,7 @@ def main():
         infile = open(sys.argv[1], 'rb')
         outfile = open(sys.argv[2], 'wb')
     else:
-        raise SystemExit("{0} [infile [outfile]]".format(sys.argv[0]))
+        raise SystemExit(sys.argv[0] + " [infile [outfile]]")
     try:
         obj = json.load(infile)
     except ValueError as e:
......
@@ -107,6 +107,8 @@ Installation

 Library
 -------

+- The json module now works exclusively with str and not bytes.
+
 - Issue #3959: The ipaddr module has been added to the standard library.
   Contributed by Google.
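The NEWS entry above can be demonstrated directly; note this sketch sticks to `dumps()`, since later Python versions relaxed `loads()` to accept bytes again.

```python
import json

# str in, str out.
assert json.dumps(["ok"]) == '["ok"]'

# bytes are rejected on the encoding path.
try:
    json.dumps(b"bytes are not serializable")
except TypeError:
    pass
else:
    raise AssertionError("expected TypeError for bytes input")
```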
......
This diff is collapsed.