• wcfs: client: Adjust Cython part to accept both bytes and str input, and yield bstr output · 69aab23a
    Kirill Smelkov authored
    wcfs/client/_wcfs.pyx provides a Cython wrapper over the C++ WCFS client, which
    works with bytes-based std::string messages. On py2 everything works fine,
    but on py3 the wrapper consequently rejects str passed as an input argument, e.g.:
    
        ```python
        _____________________________ test_join_autostart ______________________________
    
            @func
            def test_join_autostart():
                zurl = testzurl
                with raises(RuntimeError, match="wcfs: join .*: server not running"):
                    wcfs.join(zurl, autostart=False)
    
                assert wcfs._wcregistry == {}
                def _():
                    assert wcfs._wcregistry == {}
                defer(_)
    
        >       wc = wcfs.join(zurl, autostart=True)
    
        wcfs/wcfs_test.py:164:
        _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
        wcfs/__init__.py:225: in join
            wc = WCFS(mntpt, fwcfs, wcsrv)
        ../../venvs/wendelin.core/lib/python3.9/site-packages/decorator.py:232: in fun
            return caller(func, *(extras + args), **kw)
        ../pygolang/golang/__init__.py:125: in _
            return f(*argv, **kw)
        wcfs/__init__.py:167: in __init__
            wc.mountpoint = mountpoint
        wcfs/client/_wcfs.pyx:44: in wendelin.wcfs.client._wcfs.PyWCFS.mountpoint.__set__
            def __set__(PyWCFS pywc, string v):
        _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    
        >   ???
        E   TypeError: expected bytes, str found
        ```
    
    because by default Cython maps std::string to bytes on the Python side.
    
    -> Fix it by accepting both str and bytes as input for all string-related
       method arguments.
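
    A minimal sketch of this input handling in plain Python, not the actual
    Cython code from _wcfs.pyx. The helper name `to_bytes` and the choice of
    UTF-8 for encoding str are assumptions made for illustration only:

```python
def to_bytes(v):
    """Hypothetical helper: accept bytes or str, always hand bytes downstream.

    The C++ side works with bytes-based std::string, so a str argument is
    encoded (UTF-8 is assumed here) and bytes are passed through unchanged.
    """
    if isinstance(v, bytes):
        return v
    if isinstance(v, str):
        return v.encode("utf-8")
    raise TypeError("expected bytes or str, got %s" % type(v).__name__)


# Both spellings of the same (made-up) mountpoint give identical bytes.
as_str = to_bytes("/tmp/wcfs/example")
as_bytes = to_bytes(b"/tmp/wcfs/example")
```

    With a conversion like this applied at every string argument, a caller
    can pass either type and the wrapped code still sees bytes only.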
    
    For returned strings, take care to return them as strings rather than as the
    bytes to which Cython converts std::string by default, because the calling
    code expects returned messages to have string semantics. We still return the
    data as a bytestring, not unicode, because the rest of the testsuite also
    assumes that messages received from the WCFS server are binary.
    
    NOTE: even though I was the one who originally suggested in private to use
    
        cython: c_string_type=str, c_string_encoding=utf8
    
    so that the str type is accepted as input, later, after taking a broader
    look, I realized that there are two problems with these directives.
    First, the directives affect not only the input: any returned std::string
    also becomes unicode instead of bytes/bytestr as before, whereas, as
    explained above, the higher level expects binary semantics from returned
    messages. Second, if WCFS sends a message with invalid UTF-8 data, the
    client raises an exception instead of returning the data that was actually
    sent to the caller. This makes debugging more difficult, and the last thing
    I want, when WCFS sends some garbage, is to get a UnicodeDecodeError instead
    of seeing the message itself together with a higher-level assert reporting,
    with details, that the message is unexpected.
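
    The second problem can be illustrated in plain Python. This is only a
    sketch: the message content below is made up and is not real WCFS output.
    Decoding as UTF-8 fails on invalid data, while keeping the message as
    bytes lets the caller still see and report what was sent:

```python
# Hypothetical message containing bytes that are not valid UTF-8.
msg = b"1 pin \xff\xfe garbage"

# Automatic UTF-8 decoding fails before the caller sees the data.
try:
    msg.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# Kept as bytes, the message can still be inspected and put into an
# assertion message by higher-level code.
report = "unexpected message: %r" % (msg,)
print(decoded_ok)
print(report)
```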
    
    So do all the in- and out- conversions by hand instead, controlling the
    desired semantics ourselves.
    
    On py3 the implementation depends on nexedi/pygolang!21,
    but on py2 it works both with and without pygolang bstr patches.
    
    Preliminary history:
    
        vnmabus/wendelin.core@47c27b03
    
    Co-authored-by: Carlos Ramos Carreño <carlos.ramos@nexedi.com>
_wcfs.pyx 9.85 KB