When decoding UTF-16, don't assume that the buffer is in native endianness when checking surrogates.

Commit ac93bc25 authored Jun 26, 2001 by Martin v. Löwis
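
For illustration only, here is a minimal standalone sketch of the idea behind this change: put every 16-bit code unit into the stream's declared byte order before testing it against the surrogate ranges. This is not the CPython implementation. The names read_unit and decode_utf16, the big_endian flag, and the -1 error convention are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of CPython): assemble one 16-bit code
   unit from two bytes in the byte order the stream declares, so the
   result is independent of the host's native endianness. */
static uint16_t read_unit(const unsigned char *p, int big_endian)
{
    return big_endian ? (uint16_t)((p[0] << 8) | p[1])
                      : (uint16_t)((p[1] << 8) | p[0]);
}

/* Hypothetical decoder: writes code points to out and returns how many
   were written, or -1 on malformed input. out must have room for
   nbytes / 2 entries. */
static long decode_utf16(const unsigned char *s, size_t nbytes,
                         int big_endian, uint32_t *out)
{
    size_t i = 0;
    long n = 0;

    assert(s != NULL && out != NULL);
    if (nbytes % 2 != 0)
        return -1;                      /* truncated code unit */

    while (i < nbytes) {
        uint16_t ch = read_unit(s + i, big_endian);
        i += 2;

        if (ch < 0xD800 || ch > 0xDFFF) {
            out[n++] = ch;              /* ordinary BMP character */
            continue;
        }
        if (ch >= 0xDC00)
            return -1;                  /* low surrogate with no high surrogate */
        if (i >= nbytes)
            return -1;                  /* unexpected end of data */

        /* The second unit is converted the same way BEFORE its range
           check; testing the raw, unconverted unit is the kind of
           mistake the commit message warns about. */
        uint16_t ch2 = read_unit(s + i, big_endian);
        i += 2;
        if (ch2 < 0xDC00 || ch2 > 0xDFFF)
            return -1;                  /* high surrogate not followed by low */

        out[n++] = 0x10000 + (((uint32_t)(ch & 0x3FF) << 10) | (ch2 & 0x3FF));
    }
    return n;
}
```

In this sketch, the bytes D8 34 DD 1E read as big-endian form the surrogate pair D834 DD1E and decode to U+1D11E. Read with the wrong byte order, the same bytes yield the units 34D8 and 1EDD, neither of which falls in a surrogate range, so the pair goes undetected.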

Showing with 4 additions and 4 deletions

Objects/unicodeobject.c +4 -4

--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -1065,16 +1065,16 @@ PyObject *PyUnicode_DecodeUTF16(const char *s,
 	    errmsg = "unexpected end of data";
 	    goto utf16Error;
 	}
-	if (0xDC00 <= *q && *q <= 0xDFFF) {
+	if (0xD800 <= ch && ch <= 0xDBFF) {
 	    Py_UCS2 ch2 = *q++;
 #ifdef BYTEORDER_IS_LITTLE_ENDIAN
 	    if (bo == 1)
-		    ch = (ch >> 8) | (ch << 8);
+		    ch2 = (ch2 >> 8) | (ch2 << 8);
 #else    
 	    if (bo == -1)
-		    ch = (ch >> 8) | (ch << 8);
+		    ch2 = (ch2 >> 8) | (ch2 << 8);
 #endif
-	    if (0xD800 <= ch && ch <= 0xDBFF) {
+	    if (0xDC00 <= ch2 && ch2 <= 0xDFFF) {
 #if Py_UNICODE_SIZE == 2
 		/* This is valid data (a UTF-16 surrogate pair), but
 		   we are not able to store this information since our