decoder: Don't treat \r\n as combined EOL
Currently we use bufio.Reader.ReadLine which accepts either \n or \r\n as line ending. That is however not correct: - we should not accept e.g. "S'abc'\r\n." pickle, because it is invalid: In [32]: pickle.loads(b"S'abc'\r\n.") --------------------------------------------------------------------------- UnpicklingError Traceback (most recent call last) <ipython-input-32-b1da1988bae1> in <module>() ----> 1 pickle.loads(b"S'abc'\r\n.") UnpicklingError: the STRING opcode argument must be quoted - we should not accept e.g. "L123L\r\n.", because it is also invalid: In [33]: pickle.loads(b"L123L\r\n.") --------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-33-7231ec07f5c4> in <module>() ----> 1 pickle.loads(b"L123L\r\n.") ValueError: invalid literal for int() with base 10: '123L\r\n' - treating \r as part of EOL in e.g. UNICODE pickle would just drop encoded information: # python In [34]: pickle.loads(b"Vabc\r\n.") Out[34]: 'abc\r' while ogórek currently decodes it as just 'abc' (no trailing \r). For this reason let's fix Decoder.readLine to treat only \n as EOL. Besides this fix, we now get another property: previously, when internally using bufio.Reader.ReadLine we were not able to distinguish two situations: - a line was abruptly ended without any EOL characters at all, - a line was properly ended with EOL character. Now after we switched to internally using bufio.Reader.ReadSlice, we will be able to properly detect EOF and return that as error. This property will be needed in the following patch.
Showing
Please register or sign in to comment