1. 21 May, 2020 1 commit
  2. 19 May, 2020 4 commits
    • Kirill Smelkov's avatar
      golang: tests: Add tests for IPython and Pytest integration patches · 09629367
      Kirill Smelkov authored
      bb9a94c3 (golang: Teach defer to chain exceptions (PEP 3134) even on
      Python2) added integration patches for IPython and Pytest to properly
      dump tracebacks for chained exceptions even on Python2. However the
      functionality of patches was tested only manually.
      
      -> Add corresponding tests to verify how IPython and Pytest behave
      when dumping tracebacks.
      09629367
    • Kirill Smelkov's avatar
      golang: tests: assertDoc: Include ~/... into PYGOLANG normalization · 42ab98a6
      Kirill Smelkov authored
      assertDoc normalizes paths in the compared texts so that the reference
      output contains PYGOLANG instead of whatever actual path there will be
      when testing the package. This already works.
      
      However, IPython, when dumping tracebacks, tries to shorten paths and
      abbreviates $HOME to ~ in them. This breaks normalization, which fails
      to convert the prefix of those paths into PYGOLANG.
      
      -> Fix it by teaching assertDoc to also handle paths that start with ~
      and correctly normalize them.
      
      This will be needed in the next patch where we will add tests for how
      ipython and pytest dump tracebacks for chained exceptions.
      42ab98a6
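The normalization described in 42ab98a6 could look roughly like the sketch below. It is not the actual assertDoc code: the function name `normalize_pygolang` and its `home` parameter are hypothetical and exist only for illustration.

```python
import os

def normalize_pygolang(text, pygolang_dir, home=None):
    """Replace pygolang_dir, in full or ~-abbreviated form, with PYGOLANG.

    Hypothetical helper; the real assertDoc may normalize differently.
    """
    if home is None:
        home = os.path.expanduser("~")
    prefixes = [pygolang_dir]
    # IPython-style abbreviation: $HOME/x/y is printed as ~/x/y
    if pygolang_dir.startswith(home + os.sep):
        prefixes.append("~" + pygolang_dir[len(home):])
    for prefix in prefixes:
        text = text.replace(prefix, "PYGOLANG")
    return text
```

Handling the abbreviated prefix as just one more spelling of the same directory keeps the reference output identical for both tools.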
    • Kirill Smelkov's avatar
      golang: tests: Factor file-reading into readfile() utility · 2413b5ba
      Kirill Smelkov authored
      We are going to use it from several places.
      2413b5ba
    • Kirill Smelkov's avatar
      golang: tests: Factor-out path to directories into global dir_* variable · 0148cb89
      Kirill Smelkov authored
      Paths to the directories are already used in several functions, and are
      going to be used more. Move them to a common place to avoid duplication.
      0148cb89
  3. 03 May, 2020 1 commit
    • Kirill Smelkov's avatar
      golang: Teach qq to be usable with both bytes and str format whatever type qq argument is · edc7aaab
      Kirill Smelkov authored
      qq is used to quote strings or byte-strings. The following example
      illustrates the problem we are currently hitting in zodbtools with
      Python3:
      
          >>> "hello %s" % qq("мир")
          'hello "мир"'
      
          >>> b"hello %s" % qq("мир")
          Traceback (most recent call last):
            File "<stdin>", line 1, in <module>
          TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str'
      
          >>> "hello %s" % qq(b("мир"))
          'hello "мир"'
      
          >>> b"hello %s" % qq(b("мир"))
          Traceback (most recent call last):
            File "<stdin>", line 1, in <module>
          TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str'
      
      i.e. one way or another, if the type of the format string and the type of
      what qq returns do not match, a TypeError is raised.
      
      We want qq(obj) to be usable with both string and bytestring formats.
      
      For that, let's teach qq to return special str- and bytes-derived types that
      know how to convert automatically str->bytes and bytes->str via b/u
      respectively. This way formatting works for any combination of format type
      and qq argument type, and the whole result has the same type as the format.
      
      For now we teach only qq to use new types and don't generally expose
      _str and _unicode to be returned by b and u yet. However we might do so
      in the future after incrementally gaining a bit more experience.
      
      /proposed-for-review-on: !1
      edc7aaab
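The idea in edc7aaab can be illustrated with a minimal Python3-only sketch. The names `QStr`, `QBytes` and `quote` are hypothetical stand-ins, not the `_str`/`_unicode` types or the qq function from pygolang, and plain UTF-8 encoding is assumed where the real code goes through b/u.

```python
class QStr(str):
    """str that can also be interpolated into a bytes format."""
    def __bytes__(self):
        # bytes %-formatting falls back to __bytes__ for non-buffer objects
        return self.encode("utf-8")

class QBytes(bytes):
    """bytes that can also be interpolated into a str format."""
    def __str__(self):
        # str %-formatting calls str(obj), which lands here
        return self.decode("utf-8")

def quote(s):
    """Hypothetical stand-in for qq: wrap s in double quotes, keeping its kind."""
    if isinstance(s, bytes):
        return QBytes(b'"' + s + b'"')
    return QStr('"' + s + '"')

# All four combinations format without TypeError,
# and the result has the type of the format string.
text_from_str    = "hello %s" % quote("мир")
bytes_from_str   = b"hello %s" % quote("мир")
text_from_bytes  = "hello %s" % quote("мир".encode("utf-8"))
bytes_from_bytes = b"hello %s" % quote("мир".encode("utf-8"))
```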
  4. 29 Apr, 2020 2 commits
  5. 16 Apr, 2020 3 commits
    • Kirill Smelkov's avatar
      pygolang v0.0.6.post2 · 283a1558
      Kirill Smelkov authored
      A build fix wrt gevent-1.5 + benchmarks for nogil go and channels.
      283a1558
    • Kirill Smelkov's avatar
      Fix build for gevent-1.5 · 4d667fa3
      Kirill Smelkov authored
      Starting from gevent >= 1.5 '*.pxd' files for gevent API are no longer
      provided, at least in released gevent wheels. This broke pygolang:
      
          Error compiling Cython file:
          ------------------------------------------------------------
          ...
      
          # Gevent runtime uses gevent's greenlets and semaphores.
          # When sema.acquire() blocks, gevent switches us from current to another greenlet.
      
          IF not PYPY:
              from gevent._greenlet cimport Greenlet
             ^
          ------------------------------------------------------------
      
          golang/runtime/_runtime_gevent.pyx:28:4: 'gevent/_greenlet.pxd' not found
      
      Since gevent upstream refuses to restore Cython-level access[1], let's fix the
      build by using gevent bits at Python level.
      
      Even when used via py import, gevent-1.5 brings a speed improvement compared to
      gevent-1.4 (used via cimport):
      
      	(on i7@2.6GHz, gevent runtime)
      
                            gevent-1.4   gevent-1.5
                            (cimport)    (py import)
      
          name              old time/op  new time/op  delta
          pyx_select_nogil  9.47µs ± 0%  8.74µs ± 0%   -7.70%  (p=0.000 n=10+9)
          pyx_go_nogil      14.3µs ± 1%  12.0µs ± 1%  -16.52%  (p=0.000 n=10+10)
          pyx_chan_nogil    7.10µs ± 1%  6.32µs ± 1%  -10.89%  (p=0.000 n=10+10)
          go                16.0µs ± 2%  13.4µs ± 1%  -16.37%  (p=0.000 n=10+10)
          chan              7.50µs ± 0%  6.79µs ± 0%   -9.53%  (p=0.000 n=10+10)
          select            10.8µs ± 1%  10.0µs ± 1%   -6.78%  (p=0.000 n=10+10)
      
      Using gevent-1.5 could have been even faster via cimport (it is still
      possible to compile and test against gevent installed in development
      mode via `pip install -e` because pxd files are there in gevent worktree
      and tarball):
      
                            gevent-1.5   gevent-1.5
                            (py import)  (cimport)
      
          name              old time/op  new time/op  delta
          pyx_select_nogil  8.74µs ± 0%  7.90µs ± 1%  -9.60%  (p=0.000 n=9+10)
          pyx_go_nogil      12.0µs ± 1%  11.2µs ± 2%  -6.35%  (p=0.000 n=10+10)
          pyx_chan_nogil    6.32µs ± 1%  5.89µs ± 0%  -6.80%  (p=0.000 n=10+9)
          go                13.4µs ± 1%  12.4µs ± 1%  -7.54%  (p=0.000 n=10+9)
          chan              6.79µs ± 0%  6.42µs ± 0%  -5.47%  (p=0.000 n=10+10)
          select            10.0µs ± 1%   9.4µs ± 1%  -6.39%  (p=0.000 n=10+10)
      
      but we cannot use cimport to access gevent-1.5 universally, since pxd are not
      shipped in gevent wheel releases.
      
      In the future we might want to change plain version check into compile time
      check whether gevent/_greenlet.pxd is actually present or not and use faster
      access if yes. Requesting gevent to be installed in non-binary form
      might be also an option worth trying.
      
      However plain version check should be ok for now.
      
      [1] https://github.com/gevent/gevent/issues/1568
      4d667fa3
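The "check whether gevent/_greenlet.pxd is actually present" idea mentioned in 4d667fa3 as a possible future direction might be sketched as follows. This is only an illustration under assumptions: `package_ships_file` is a hypothetical helper, and the commit itself uses a plain version check.

```python
import importlib.util
import os

def package_ships_file(pkg, relpath):
    """Report whether top-level package pkg is installed and contains relpath.

    Hypothetical helper; pkg must be a top-level package name.
    """
    spec = importlib.util.find_spec(pkg)
    if spec is None or not spec.submodule_search_locations:
        return False
    return any(os.path.exists(os.path.join(d, relpath))
               for d in spec.submodule_search_locations)

# A build script could then decide, for example:
#   use_cimport = package_ships_file("gevent", "_greenlet.pxd")
```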
    • Kirill Smelkov's avatar
      golang: Add benchmarks for nogil go and channels · 2114a560
      Kirill Smelkov authored
      on i7@2.6GHz it looks like:
      
      thread runtime:
      
          name              time/op
          pyx_select_nogil  2.70µs ±13%
          pyx_go_nogil      15.9µs ± 1%
          pyx_chan_nogil    2.79µs ± 2%
          go                17.6µs ± 0%
          chan              3.05µs ± 4%
          select            3.62µs ± 4%
      
      gevent runtime (gevent-1.4.0):
      
          name              time/op
          pyx_select_nogil  9.39µs ± 1%
          pyx_go_nogil      15.1µs ± 2%
          pyx_chan_nogil    7.10µs ± 1%
          go                16.6µs ± 1%
          chan              7.47µs ± 1%
          select            10.7µs ± 0%
      2114a560
  6. 15 Apr, 2020 1 commit
  7. 05 Mar, 2020 1 commit
  8. 28 Feb, 2020 3 commits
    • Kirill Smelkov's avatar
      pygolang v0.0.6 · 5e1cb5ea
      Kirill Smelkov authored
      5e1cb5ea
    • Kirill Smelkov's avatar
      strconv: Fix b & friends on macos/windows · 0561926a
      Kirill Smelkov authored
      On macos and windows, Python2 is built with --enable-unicode=ucs2, which
      makes it use UTF-16 encoding for unicode characters, and so for
      characters U+10000 and above it uses surrogate encoding with _2_
      unicode points, for example:
      
              >>> import sys
              >>> sys.maxunicode
              65535                       <-- NOTE indicates UCS2 build
              >>> s = u'\U00012345'
              >>> s
              u'\U00012345'
              >>> s.encode('utf-8')
              '\xf0\x92\x8d\x85'
              >>> len(s)
              2                           <-- NOTE _not_ 1
              >>> s[0]
              u'\ud808'
              >>> s[1]
              u'\udf45'
      
      This leads to e.g. b tests failing for
      
          # tbytes                        tunicode
          (b"\xf0\x90\x8c\xbc",           u'\U0001033c'),     # Valid 4 Octet Sequence '𐌼'
      
          >           assert b(tunicode) == tbytes
          E           AssertionError: assert '\xed\xa0\x80\xed\xbc\xbc' == '\xf0\x90\x8c\xbc'
          E             - \xed\xa0\x80\xed\xbc\xbc
          E             + \xf0\x90\x8c\xbc
      
      because on UCS2 python build u'\U0001033c' is represented as 2 unicode
      points:
      
          >>> s = u'\U0001033c'
          >>> len(s)
          2
          >>> s[0]
          u'\ud800'
          >>> s[1]
          u'\udf3c'
          >>> s[0].encode('utf-8')
          '\xed\xa0\x80'
          >>> s[1].encode('utf-8')
          '\xed\xbc\xbc'
      
      -> Fix it by detecting UCS2 build and working around by manually
      combining such surrogate unicode pairs appropriately.
      
      A reference on the subject:
      
      https://matthew-brett.github.io/pydagogue/python_unicode.html#utf-16-ucs2-builds-of-python-and-32-bit-unicode-code-points
      0561926a
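Combining a surrogate pair into one code point, as 0561926a describes, follows the standard UTF-16 arithmetic. The sketch below shows only that arithmetic; `combine_surrogates` is a hypothetical name and the code is not taken from strconv.

```python
def combine_surrogates(hi, lo):
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not (0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF):
        raise ValueError("not a surrogate pair: %#x %#x" % (hi, lo))
    # each surrogate carries 10 bits of the code point offset above U+10000
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)

# The pair from the failing test above: u'\ud800' u'\udf3c' -> U+1033C
rune = combine_surrogates(0xD800, 0xDF3C)
```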
    • Kirill Smelkov's avatar
      strconv: Switch _utf8_decode_rune to return rune ordinal instead of unicode character · 5cc679ac
      Kirill Smelkov authored
      This is a preparatory step for the next patch where we'll be fixing
      strconv for Python2 builds with --enable-unicode=ucs2, where a unicode
      character can take _2_ unicode points.
      
      In that general case, relying on unicode objects to represent runes is
      not good, because many things do not work for U+10000 and above,
      e.g. ord breaks:
      
          >>> import sys
          >>> sys.maxunicode
          65535                       <-- NOTE indicates UCS2 build
          >>> s = u'\U00012345'
          >>> s
          u'\U00012345'
          >>> s.encode('utf-8')
          '\xf0\x92\x8d\x85'
          >>> len(s)
          2                           <-- NOTE _not_ 1
          >>> ord(s)
          Traceback (most recent call last):
            File "<stdin>", line 1, in <module>
          TypeError: ord() expected a character, but string of length 2 found
      
      so we switch to representing runes as integers, similarly to what Go does.
      5cc679ac
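A rune decoder that returns an integer ordinal rather than a unicode character, in the spirit of 5cc679ac, could be sketched like this. It is a Python3 illustration only: `decode_rune` and `RUNE_ERROR` are hypothetical names modelled on Go's utf8.DecodeRune, and `_utf8_decode_rune` in pygolang is implemented differently.

```python
RUNE_ERROR = 0xFFFD  # replacement ordinal, as in Go's utf8.RuneError

def decode_rune(data):
    """Return (rune ordinal, byte length) for the first UTF-8 rune in data.

    Invalid input yields (RUNE_ERROR, 1); empty input yields (RUNE_ERROR, 0).
    """
    for size in range(1, min(4, len(data)) + 1):
        try:
            ch = data[:size].decode("utf-8")
        except UnicodeDecodeError:
            continue  # prefix is incomplete or invalid; try a longer one
        return ord(ch), size
    return RUNE_ERROR, (1 if data else 0)
```

Returning plain integers keeps callers independent of how the interpreter stores characters above U+FFFF.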
  9. 27 Feb, 2020 3 commits
  10. 20 Feb, 2020 1 commit
  11. 17 Feb, 2020 1 commit
    • Kirill Smelkov's avatar
      sync += RWMutex · 1ad3c2d5
      Kirill Smelkov authored
      Provide sync.RWMutex, which can be useful for cases when there are
      multiple simultaneous readers and less frequent writer(s).
      
      This implements a readers-writer mutex with preference for writers,
      similarly to the Go version.
      1ad3c2d5
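To make "preference for writers" concrete, here is a minimal self-contained sketch of a writer-preferring readers-writer lock built on threading.Condition. It is not the pygolang implementation; the class `RWLock` and its method names are hypothetical.

```python
import threading

class RWLock:
    """Readers-writer lock where waiting writers block new readers."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # number of active readers
        self._writer = False       # whether a writer holds the lock
        self._writers_waiting = 0  # writers queued for the lock

    def rlock(self):
        with self._cond:
            # writer preference: also wait while any writer is queued
            while self._writer or self._writers_waiting > 0:
                self._cond.wait()
            self._readers += 1

    def runlock(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def lock(self):
        with self._cond:
            self._writers_waiting += 1
            try:
                while self._writer or self._readers > 0:
                    self._cond.wait()
            finally:
                self._writers_waiting -= 1
            self._writer = True

    def unlock(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

Without the `_writers_waiting` check in `rlock`, a steady stream of readers could starve writers indefinitely.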
  12. 12 Feb, 2020 1 commit
  13. 11 Feb, 2020 3 commits
    • Kirill Smelkov's avatar
      errors: Take .__cause__ into account · 03f88c0b
      Kirill Smelkov authored
      A Python error can have links to other errors by means of both .Unwrap()
      and .__cause__ . Both ways are explicit and so should be treated
      by e.g. errors.Is as present in the error's error chain.
      
      It is a bit unclear, at least initially, how to linearise and order
      error chain traversal at divergence points - for exception objects where
      both .Unwrap() and .__cause__ are !None. However, a closer look
      suggests a linearisation rule to traverse into .__cause__ after going
      through the .Unwrap() part - please see details in the documentation added to
      _error.pyx
      
      -> Teach errors.Is to do this traversal, and this way now e.g. exception
      raised as
      
      	raise X from Y
      
      will be treated by errors.Is as being both X and Y, even if either X or Y
      also has its own error chain via .Unwrap().
      
      Top-level documentation is TODO.
      03f88c0b
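One possible reading of the linearisation rule from 03f88c0b (visit the .Unwrap() part first, then go into .__cause__) is sketched below. The exact order used by pygolang is documented in _error.pyx and may differ in detail; `iter_chain`, `is_in_chain` and `Wrap` are hypothetical names.

```python
class Wrap(Exception):
    """Hypothetical error that wraps another one and exposes .Unwrap()."""
    def __init__(self, text, inner):
        super().__init__(text)
        self._inner = inner

    def Unwrap(self):
        return self._inner

def iter_chain(err):
    """Yield err, then its .Unwrap() part, then continue into .__cause__."""
    while err is not None:
        yield err
        unwrap = getattr(err, "Unwrap", None)
        inner = unwrap() if callable(unwrap) else None
        if inner is not None:
            yield from iter_chain(inner)
        err = getattr(err, "__cause__", None)

def is_in_chain(err, target):
    """Simplified errors.Is analog: same type and == against any chain item."""
    return any(type(e) is type(target) and e == target
               for e in iter_chain(err))

# X wraps Z via .Unwrap() and was raised "from" Y
z = ValueError("z")
y = KeyError("y")
x = Wrap("x", z)
x.__cause__ = y
order = list(iter_chain(x))
```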
    • Kirill Smelkov's avatar
      golang, errors, fmt: Error chaining (Python) · 337de0d7
      Kirill Smelkov authored
      Following errors model in Go and fd95c88a (golang, errors, fmt: Error
      chaining (C++/Pyx)) let's add support at Python-level for errors to wrap
      each other and to be inspected/unwrapped:
      
      - an error can additionally provide a way to unwrap itself, if it
        provides .Unwrap() method. .__cause__ is not taken into account yet,
        but will be in a follow-up patch;
      - errors.Is(err, target) tests whether an item in error's chain matches target;
      - `fmt.Errorf("... : %w", ... err)` is similar to `"... : %s" % (..., err)`
        but resulting error, when unwrapped, will return err;
      - errors.Unwrap is not exposed as chaining through both .Unwrap() and
        .__cause__ will need more than just "current element" as unwrapping
        state (i.e. errors.Unwrap API is insufficient - see next patch), and
        in practice users of errors.Unwrap() are very rare.
      
      Support for error chaining through .__cause__ will follow in the next
      patch.
      
      Top-level documentation is TODO.
      
      See https://blog.golang.org/go1.13-errors for error chaining overview.
      337de0d7
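The %w semantics described in 337de0d7 (format like %s, but remember the wrapped error) can be shown with a small stand-alone sketch. `WrappedError` and `wrapf` are hypothetical names used for illustration only; they are not the fmt.Errorf implementation.

```python
class WrappedError(Exception):
    """Error whose text includes another error and which can be unwrapped."""
    def __init__(self, text, err):
        super().__init__(text)
        self._err = err

    def Unwrap(self):
        return self._err

def wrapf(prefix, err):
    """Hypothetical analog of fmt.Errorf("<prefix>: %w", err)."""
    return WrappedError("%s: %s" % (prefix, err), err)

cause = OSError("disk full")
wrapped = wrapf("save", cause)
```

The text of `wrapped` reads the same as with plain %s formatting, while `wrapped.Unwrap()` still gives access to the original error object.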
    • Kirill Smelkov's avatar
      golang: Teach pyerror to be a base class · 78d0c76f
      Kirill Smelkov authored
      It is surprising to have an exception class that cannot be derived from.
      
      Besides, in the future we'll use subclassing from golang.error as an
      indicator that an error is "well-defined" (in simple words - does not
      need a traceback to be interpreted).
      78d0c76f
  14. 10 Feb, 2020 1 commit
    • Kirill Smelkov's avatar
      golang: Expose error at Py level · 17798442
      Kirill Smelkov authored
      The first step to expose errors and error chaining to Python:
      
      - Add pyerror that wraps a pyx/nogil C-level error and is exposed as golang.error at py level.
      - py errors must be compared by ==, not by "is"
      - Add (py) errors.New to create a new error from text.
      - a C-level error that has .Unwrap, is exposed with .Unwrap at py level,
        but full py-level chaining will be implemented in a follow-up patch.
      - py error does not support inheritance yet.
      
      Top-level documentation is TODO.
      17798442
  15. 06 Feb, 2020 1 commit
    • Kirill Smelkov's avatar
      golang, errors, fmt: Error chaining (C++/Pyx) · fd95c88a
      Kirill Smelkov authored
      Following errors model in Go, let's add support for errors to wrap other
      errors and to be inspected/unwrapped:
      
      - an error can additionally provide way to unwrap itself, if it
        implements errorWrapper interface;
      - errors.Unwrap(err) tries to extract wrapped error;
      - errors.Is(err, target) tests whether an item in error's chain matches target;
      - `fmt.errorf("... : %w", ... err)` is similar to `fmt.errorf("... : %s", ... err.c_str())`
        but resulting error, when unwrapped, will return err.
      
      Add C++ implementation for the above + tests.
      Python analogs will follow in the next patches.
      
      Top-level documentation is TODO.
      
      See https://blog.golang.org/go1.13-errors for error chaining overview.
      fd95c88a
  16. 04 Feb, 2020 13 commits
    • Kirill Smelkov's avatar
      cxx: Correct dict interface · 58fcdd87
      Kirill Smelkov authored
      Package cxx was added in 9785f2d3 (cxx: New package), but the interface
      that cxx:dict provided turned out to be not optimal:
      
          dict.get  was returning (v, ok), and
          dict.pop  ----//---
      
      Correct dict.get and dict.pop to return just value, and, similarly to
      channels API, provide additional dict.get_ and dict.pop_ - extended
      versions that also return ok:
      
          dict.get(k)  -> v
          dict.pop(k)  -> v
          dict.get_(k) -> (v, ok)
          dict.pop_(k) -> (v, ok)
      
      This time add tests.
      58fcdd87
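The corrected interface from 58fcdd87 is C++, but its shape can be illustrated in Python. The sketch assumes that a missing key yields a caller-supplied zero value, which the commit message does not state; `ZDict` and its `set` method are hypothetical.

```python
class ZDict:
    """Dict-like container with plain and extended (value, ok) accessors."""

    def __init__(self, zero=None):
        self._zero = zero  # assumption: returned when the key is absent
        self._data = {}

    def set(self, k, v):
        self._data[k] = v

    def get_(self, k):
        """Extended get: return (v, ok)."""
        if k in self._data:
            return self._data[k], True
        return self._zero, False

    def get(self, k):
        """Plain get: return just the value."""
        v, _ = self.get_(k)
        return v

    def pop_(self, k):
        """Extended pop: remove k and return (v, ok)."""
        if k in self._data:
            return self._data.pop(k), True
        return self._zero, False

    def pop(self, k):
        """Plain pop: remove k and return just the value."""
        v, _ = self.pop_(k)
        return v
```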
    • Kirill Smelkov's avatar
      golang: fmt.pxd -> _fmt.pxd + fmt.pxd import redirector · dbd051f1
      Kirill Smelkov authored
      Follow the scheme established and used for all other packages, because
      we will soon have fmt pyx part which, if named as fmt.pyx, will
      intersect and conflict with fmt.py .
      dbd051f1
    • Kirill Smelkov's avatar
      errors: Test for New (C++) · 288e16a7
      Kirill Smelkov authored
      errors.New was added in a245ab56 (errors: New package) without test.
      288e16a7
    • Kirill Smelkov's avatar
      golang: testing: Provide file and line for a failing ASSERT_EQ · ff2ed5fe
      Kirill Smelkov authored
      Makes it easier to understand which test failed and where.
      ff2ed5fe
    • Kirill Smelkov's avatar
      libgolang: tests: Factor out common testing functionality into shared place · 46a6f424
      Kirill Smelkov authored
      Currently libgolang_test.cpp contains tests for code in libgolang.cpp and
      for code that lives in other libgolang packages - sync, fmt, etc. It is
      becoming crowded, and we are going to split libgolang_test.cpp and move
      package tests to their corresponding files - e.g. to sync_test.cpp and
      the like.
      
      Move common assertion utilities into shared header before that as a
      preparatory step.
      46a6f424
    • Kirill Smelkov's avatar
      golang: qq: Don't depend on six · e028cf28
      Kirill Smelkov authored
      Just use builtins and cimported things that we have at pyx level.
      e028cf28
    • Kirill Smelkov's avatar
      golang: qq: Use u for UTF-8 decoding · 3073ac98
      Kirill Smelkov authored
      u is the preferred way to make sure an object is a unicode string.
      3073ac98
    • Kirill Smelkov's avatar
      gcompat: Move qq into golang · 8c459a99
      Kirill Smelkov authored
      This will allow integrating qq with u in the next patch.
      
      Moving to compiled code for string processing functions is also
      generally better for performance.
      8c459a99
    • Kirill Smelkov's avatar
      golang: Provide b, u for strings · bcb95cd5
      Kirill Smelkov authored
      With Python3 I've got tired of constantly using .encode() and .decode();
      getting an exception if the original argument was unicode on e.g. b.decode();
      getting an exception on raw bytes that are invalid UTF-8; not being able to
      use bytes literals with non-ASCII characters, etc.
      
      So instead of this pain, provide two functions that make sure an object
      is either bytes or unicode:
      
      - b converts str/unicode/bytes s to UTF-8 encoded bytestring.
      
      	Bytes input is preserved as-is:
      
      	   b(bytes_input) == bytes_input
      
      	Unicode input is UTF-8 encoded. The encoding always succeeds.
      	b is reverse operation to u - the following invariant is always true:
      
      	   b(u(bytes_input)) == bytes_input
      
      - u converts str/unicode/bytes s to unicode string.
      
      	Unicode input is preserved as-is:
      
      	   u(unicode_input) == unicode_input
      
      	Bytes input is UTF-8 decoded. The decoding always succeeds and input
      	information is not lost: non-valid UTF-8 bytes are decoded into
      	surrogate codes ranging from U+DC80 to U+DCFF.
      	u is reverse operation to b - the following invariant is always true:
      
      	   u(b(unicode_input)) == unicode_input
      
      NOTE: encoding _and_ decoding *never* fail or lose information. This
      is achieved by using the 'surrogateescape' error handler on Python3, and
      providing a manual fallback that behaves the same way on Python2.
      
      The naming is chosen so that b(something) resembles
      b"something", and u(something) resembles u"something".
      
      This, even being only a part of the strings solution discussed in [1],
      should help handle byte- and unicode- strings in a more robust and
      distraction-free way.
      
      Top-level documentation is TODO.
      
      [1] zodbtools!13
      bcb95cd5
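On Python3 the core of the b/u idea from bcb95cd5 can be sketched with the standard 'surrogateescape' error handler. This is a simplified illustration, not the pygolang code: `to_bytes` and `to_unicode` are hypothetical names, the Python2 fallback is omitted, and unicode input that already contains surrogates outside U+DC80..U+DCFF is not handled here.

```python
def to_bytes(s):
    """Sketch of b: return s as UTF-8 bytes; bytes input is kept as-is."""
    if isinstance(s, bytes):
        return s
    # escaped surrogates U+DC80..U+DCFF turn back into the original raw bytes
    return s.encode("utf-8", "surrogateescape")

def to_unicode(s):
    """Sketch of u: return s as str; str input is kept as-is."""
    if isinstance(s, str):
        return s
    # invalid UTF-8 bytes become surrogates U+DC80..U+DCFF instead of failing
    return s.decode("utf-8", "surrogateescape")

raw = b"\xff\xfeabc"      # not valid UTF-8
text = to_unicode(raw)    # does not raise
roundtrip = to_bytes(text)
```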
    • Kirill Smelkov's avatar
      libgolang: Provide Nil as alias for std::nullptr_t · 230c81c4
      Kirill Smelkov authored
      This continues 60f6db6f (libgolang: Provide nil as alias for nullptr and
      NULL): I've tried to compile pygolang with Clang on my Debian 10
      workstation and got:
      
          $ CC=clang CXX=clang++ python setup.py build_dso -i
      
          In file included from ./golang/fmt.h:32:
          ./golang/libgolang.h:381:11: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t'?
          constexpr nullptr_t nil = nullptr;
                    ^~~~~~~~~
                    std::nullptr_t
          /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8/bits/c++config.h:242:29: note: 'std::nullptr_t' declared here
            typedef decltype(nullptr)     nullptr_t;
                                          ^
          :
          In file included from ./golang/context.h
          In file included from golang/runtime/libgolang.cpp:30:
          ./golang/libgolang.h:381:11: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t'?
          constexpr nullptr_t nil = nullptr;
                    ^~~~~~~~~
                    std::nullptr_t
          /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8/bits/c++config.h:242:29: note: 'std::nullptr_t' declared here
            typedef decltype(nullptr)     nullptr_t;
                                          ^
          :39:
          ./golang/libgolang.h:381:11: error: unknown type In file included from golang/fmt.cpp:25:
          In file included from ./golang/fmt.h:32:
          ./golang/libgolang.h:421:17: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t'?
              inline chan(nullptr_t) { _ch = nil; }
                          ^~~~~~~~~
                          std::nullptr_t
      
          ...
      
      It seems with GCC and Clang under macOS nullptr_t is automatically provided in
      builtin namespace, while with older Clang on Linux (clang version 7.0.1-8) only
      in std:: namespace - rightfully as nullptr_t is described to be present there:
      
      https://en.cppreference.com/w/cpp/types/nullptr_t
      
      This way we either have to correct all occurrences of nullptr_t to
      std::nullptr_t, or do something similar by providing nil under golang:: .
      
      To reduce noise I prefer the latter and let it be named Nil.
      230c81c4