Commit 08342a2d authored by Nikita Nemkin

Pass-through single surrogates in Py_UNICODE[] literal encoding routine.

parent 2e3cf3da
@@ -280,9 +280,9 @@ def encode_pyunicode_string(s):
     else:
         utf16, utf32 = s, []
         for code_unit in s:
-            if 0xDC00 <= code_unit <= 0xDFFF:  # low surrogate
-                high, low = utf32.pop(), code_unit
-                utf32.append(((high & 0x3FF) << 10) + (low & 0x3FF) + 0x10000)
+            if 0xDC00 <= code_unit <= 0xDFFF and utf32 and 0xD800 <= utf32[-1] <= 0xDBFF:
+                high, low = utf32[-1], code_unit
+                utf32[-1] = ((high & 0x3FF) << 10) + (low & 0x3FF) + 0x10000
             else:
                 utf32.append(code_unit)
...
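
For illustration only, the new branch can be exercised in isolation. The sketch below is not part of the commit: `combine_surrogates` is a hypothetical helper name, and it reproduces only the loop touched by this diff, assuming the input is already a list of UTF-16 code units. It shows that a low surrogate is merged only when the previous element is a high surrogate, and is otherwise passed through unchanged.

```python
def combine_surrogates(code_units):
    """Merge valid UTF-16 surrogate pairs; pass lone surrogates through.

    Hypothetical standalone helper mirroring the patched loop.
    """
    utf32 = []
    for code_unit in code_units:
        is_low = 0xDC00 <= code_unit <= 0xDFFF
        # Combine only if the previous unit is a high surrogate.
        if is_low and utf32 and 0xD800 <= utf32[-1] <= 0xDBFF:
            high, low = utf32[-1], code_unit
            utf32[-1] = ((high & 0x3FF) << 10) + (low & 0x3FF) + 0x10000
        else:
            utf32.append(code_unit)
    return utf32


# A well-formed pair becomes one code point (U+1F600).
assert combine_surrogates([0xD83D, 0xDE00]) == [0x1F600]
# A lone low surrogate, at the start or after a BMP character, is kept as-is.
assert combine_surrogates([0xDE00, 0x41]) == [0xDE00, 0x41]
assert combine_surrogates([0x41, 0xDC00]) == [0x41, 0xDC00]
# A lone high surrogate is also kept as-is.
assert combine_surrogates([0xD83D, 0x41]) == [0xD83D, 0x41]
```

With the pre-patch logic, a leading low surrogate would call `pop()` on an empty list and raise `IndexError`, and a low surrogate following an ordinary character would be merged with that character into a wrong code point.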