-
Kirill Smelkov authored
Starting from 2013 (from 555efd8f "first draft of dumb pickle encoder") wrt []byte ogórek state was: 1. []byte was encoded as string 2. there was no way to decode a pickle object into []byte then, as - []byte encoding was never explicitly tested, - nor I could find any usage of such encodings via searching through all Free / Open-Source software ogórek uses - I searched via "Uses" of NewEncoder on godoc: https://sourcegraph.com/github.com/kisielk/og-rek/-/blob/encode.go#L48:6=&tab=references:external it is likely that []byte encoding support was added just for the sake of it and convenience and then never used. It is also likely that the original author does not use ogórek encoder anymore: https://github.com/kisielk/og-rek/pull/52#issuecomment-423639026 For those reasons I tend to think that it should be relatively safe to change how []byte is handled: - the need to change []byte handling is that currently []byte is a kind of exception: we can only encode it and not decode something into it. Currently encode/decode roundtrip for []byte gives string, which breaks the property of encode/decode being identity for all other basic types. - on the similar topic, on encoding strings are assumed UTF-8 and are encoded as UNICODE opcodes for protocol >= 3. Passing arbitrary bytes there seems to be not good. - on to how change []byte - sadly it cannot be used as Python's bytes counterpart. In fact in the previous patch we actually just added ogórek.Bytes as a counterpart for Python bytes. We did not used []byte for that because - contrary to Python bytes - []byte cannot be used as a dict key. - the most natural counterpart for Go's []byte is thus Python's bytearray: https://docs.python.org/3/library/stdtypes.html#bytearray-objects which is "a mutable counterpart to bytes objects" So add Python's bytearray decoding into []byte, and correspondingly change []byte encoding to be encoded as bytearray. P.S. This changes encoder semantic wrt []byte. If some ogórek use breaks somewhere because of it, we could add an encoder option to restore backward compatible behaviour. However since I suspect noone was actually encoding []byte into pickles, I prefer we wait for someone to speak-up first instead of loading EncoderConfig with confusion options that nobody will ever use.
30ebda02