python: Add 2/3 compat wrappers for byte strings

Introduce some helpers for managing bytes/unicode objects in a way that
bridges the gap from python2 to 3.

1. Add printb() helper for writing bytes output directly to stdout. This
   avoids complaints from print() in python3, which expects a unicode
   str(). Since python 3.5, `b"" % bytes()` style format strings should
   work and we can write tools with common code, once we convert format
   strings to bytes.
   http://legacy.python.org/dev/peps/pep-0461/

2. Add a class for wrapping command line arguments that are intended for
   comparing to debugged memory, for instance running process COMM or
   kernel pathname data. The approach takes some of the discussion from
   http://legacy.python.org/dev/peps/pep-0383/ into account, though
   unfortunately the python2-future implementation of "surrogateescape"
   is buggy, therefore this iteration is partial.

   The object instance should only ever be coerced into a bytes object.
   This silently invokes encode(sys.getfilesystemencoding()), which if it
   fails implies that the tool was passed junk characters on the command
   line. Thereafter the tool should implement only bytes-bytes
   comparisons (for instance re.search(b"", b"")) and bytes stdout
   printing (see printb).

3. Add an _assert_is_bytes helper to check for improper usage of str
   objects in python arguments. The behavior of the assertion can be
   tweaked by changing the bcc.utils._strict_bytes bool.

Going forward, one should never invoke decode() on a bpf data stream,
e.g. the result of a table lookup or perf ring output. Leave that data
in the native bytes() representation.

Signed-off-by: Brenden Blanco <bblanco@gmail.com>
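
For illustration only, the three helpers described above could look
roughly like the python3-only sketch below. This is not the code added
by this commit: the bodies, the STRICT_BYTES flag (a stand-in for
bcc.utils._strict_bytes) and the warn-then-encode fallback in the
non-strict case are all assumptions made for the example.

```python
import re
import sys
import warnings

# Hypothetical stand-in for bcc.utils._strict_bytes.
STRICT_BYTES = False


def printb(data, file=sys.stdout, nl=True):
    """Write a bytes object to a stream without going through print()."""
    # Text streams such as sys.stdout expose the raw byte stream as
    # .buffer; binary streams are written to directly.
    buf = getattr(file, "buffer", file)
    buf.write(data)
    if nl:
        buf.write(b"\n")
    file.flush()


class ArgString:
    """Wrap a command line argument destined for bytes-bytes comparison."""

    def __init__(self, arg):
        self.arg = arg

    def __bytes__(self):
        # Raises UnicodeEncodeError if the argument cannot be encoded,
        # which suggests junk characters on the command line.
        return self.arg.encode(sys.getfilesystemencoding())


def assert_is_bytes(arg):
    """Return arg as bytes; reject or warn on anything else (assumed policy)."""
    if arg is None or isinstance(arg, bytes):
        return arg
    if STRICT_BYTES:
        raise TypeError("expected bytes, got %s" % type(arg).__name__)
    warnings.warn("not a bytes object: %r" % (arg,), DeprecationWarning,
                  stacklevel=2)
    if isinstance(arg, ArgString):
        return bytes(arg)
    return arg.encode(sys.getfilesystemencoding())


if __name__ == "__main__":
    # The argument is coerced to bytes once, then only compared to bytes.
    wanted = bytes(ArgString(sys.argv[1] if len(sys.argv) > 1 else "bash"))
    task_comm = b"bash"  # placeholder for data read from a bpf table
    if re.search(wanted, task_comm):
        # PEP 461 bytes formatting, available since python 3.5.
        printb(b"%-6d %s" % (1234, task_comm))
```

In this sketch the tool never calls decode() on task_comm; the user
input is the side that gets converted, and output goes through printb().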