README.rst · 04be919bc050908eba0bca393f26385353cbc597 · Kirill Smelkov / pygolang

golang_str: bstr/ustr index access · 04be919b

Kirill Smelkov authored Oct 07, 2022

Implement access to bstr/ustr by [index] and by slice. Similarly to
standard str, such [index] access returns the same bstr/ustr type
holding one character:

  - ustr[i] returns ustr with one unicode character taken from i'th character of original string, while
  - bstr[i] returns bstr with one byte taken from i'th byte of original bytestring.

This follows str/unicode semantics on both py2/py3 and bytes semantics on
py2, but diverges from bytes semantics on py3. I originally tried to
follow bytes/py3 semantics, with bstr[i] returning an integer instead of a
1-byte character, but later found several compatibility breakages due to
it. I contemplated this divergence for a long time and finally decided to
follow string semantics for both ustr and bstr. This preserves backward
compatibility with Python2 and also allows bstr to be a practically
drop-in replacement for the str type.
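
The divergence can be illustrated with standard types only. The sketch
below assumes Python 3; `ByteStrSketch` is a hypothetical name used for
illustration and is not the actual bstr implementation:

```python
# On Python 3, indexing bytes yields an integer, while indexing str
# yields a 1-character str.
assert b"abc"[0] == 97
assert "abc"[0] == "a"


class ByteStrSketch(bytes):
    """Hypothetical bytes subclass whose [index] follows string semantics."""

    def __getitem__(self, idx):
        if isinstance(idx, slice):
            # Slices of bytes are already bytes; only rewrap the type.
            return ByteStrSketch(bytes.__getitem__(self, idx))
        # Normalize the integer index and check bounds ourselves, then take
        # a 1-byte slice so the result is a bytestring rather than an int.
        i = idx.__index__()
        n = len(self)
        if i < 0:
            i += n
        if not 0 <= i < n:
            raise IndexError("index out of range")
        return ByteStrSketch(bytes.__getitem__(self, slice(i, i + 1)))


b = ByteStrSketch(b"abc")
first = b[0]    # 1-byte bytestring, not the integer 97
last = b[-1]
tail = b[1:]
```

With this sketch `b[0]` compares equal to `b"a"` and has the same type
as `b`, which mirrors how `"abc"[0]` behaves for str.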

To get the ordinal corresponding to a retrieved character, one can use
standard `ord`, e.g. as in `ord(bstr[i])`. This always returns an
integer for all of bstr/ustr/str/unicode. Similarly to standard `chr` and
`unichr`, we also provide two utility functions - `uchr` and `bbyte` - to
create 1-character ustr and 1-byte bstr respectively.
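
The round trip between ordinals and 1-element strings can be sketched
with standard types. The code assumes Python 3; `bbyte_sketch` and
`uchr_sketch` are hypothetical stand-ins that return plain bytes and str,
not the actual `bbyte` and `uchr`, which return bstr and ustr:

```python
def bbyte_sketch(i):
    """Return a 1-byte bytestring for ordinal i (0 <= i <= 255)."""
    return bytes([i])


def uchr_sketch(i):
    """Return a 1-character unicode string for code point i."""
    return chr(i)


# ord accepts both a 1-byte bytestring and a 1-character string and
# returns an integer in either case.
byte_ord = ord(bbyte_sketch(0xFF))
char_ord = ord(uchr_sketch(0x44F))
```

Here `ord` inverts both helpers, so the same `ord(s[i])` spelling works
regardless of whether `s` holds bytes or unicode characters.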
