Commits · 33ea7769b3b36f3c8b35d46a65346fad9c710029 · Levin Zimmermann / wendelin.core

17 Sep, 2024 14 commits

wcfs: tests: Move client to be pinkill'ed into separate process · 33ea7769

Kirill Smelkov authored Sep 16, 2024

If we don't, the whole testing process will be killed once wcfs is
taught to kill clients that do not handle pin notifications well.

Use multiprocessing to do so and to be able to interoperate with the spawned
test process by sending/receiving objects to/from it.

Preliminary history:

    levin.zimmermann/wendelin.core@aef0f0e1

Co-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com>

/discussed-on nexedi/wendelin.core!18
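
A minimal sketch of the idea described in this commit, not the actual test code: the client that is expected to be killed runs in its own process, objects are exchanged with it over a pipe, and the parent only observes the exit code. The names `_faulty_client` and `run_client_in_subprocess` are hypothetical, SIGKILL sent to self stands in for wcfs doing the pinkill, and the "fork" start method (POSIX only) is an assumption made to keep the sketch short.

```python
import multiprocessing
import os
import signal

# Assumption: POSIX host. The "fork" start method avoids the __main__ guard
# that the "spawn" start method would require.
_mp = multiprocessing.get_context("fork")


def _faulty_client(conn):
    # Hypothetical stand-in for a client that mishandles pin notifications.
    conn.send({"pid": os.getpid(), "state": "started"})
    conn.recv()                              # wait for "go" from the test
    os.kill(os.getpid(), signal.SIGKILL)     # simulate being pinkill'ed


def run_client_in_subprocess():
    """Run the faulty client in a child process; return (hello, exitcode)."""
    parent_conn, child_conn = _mp.Pipe()
    p = _mp.Process(target=_faulty_client, args=(child_conn,))
    p.start()
    hello = parent_conn.recv()               # objects travel over the pipe
    parent_conn.send("go")
    p.join(10)                               # the test process itself survives
    return hello, p.exitcode
```

The parent sees a negative exit code equal to the signal number, so the test can assert on the kill without being killed itself.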


wcfs: tests: Fix thinko in "sleep > wcfs pin timeout - wcfs must kill us" · 1303799e

Kirill Smelkov authored Sep 16, 2024

If wcfs kills a client that did not respond to a pin notification within
pintimeout, we need to wait strictly _more_ than that time to detect
whether the client was killed or not. And in practice, due to noise in
operating system load and other factors, the waiting time should be
significantly greater to reliably detect the lack of the expected event.
However we were waiting for exactly 1·pintimeout and were claiming that
there was no pinkill event right after that.

-> Wait for 2·pintimeout instead of 1·pintimeout to make pinkill detection robust.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18
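
The reasoning above can be illustrated with a small sketch. It is not the test code from this commit: `wait_pinkill` and `demo` are hypothetical names, and a `threading.Timer` stands in for wcfs killing the client shortly after pintimeout expires.

```python
import threading


def wait_pinkill(killed, pintimeout):
    """Report whether `killed` was set within 2*pintimeout seconds."""
    return killed.wait(timeout=2 * pintimeout)


def demo(pintimeout=0.2):
    killed = threading.Event()
    # Simulate the kill arriving somewhat after pintimeout, as it would
    # with scheduling noise; waiting for exactly 1*pintimeout would miss it.
    t = threading.Timer(1.25 * pintimeout, killed.set)
    t.start()
    try:
        return wait_pinkill(killed, pintimeout)
    finally:
        t.cancel()
        t.join()
```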


wcfs: tests: Use small "pin timeout" for faulty protection tests · e8a3f34a

Kirill Smelkov authored Sep 16, 2024

The default "pin timeout" is 30s and we are going to add many tests that
exercise pinkilling functionality soon. If every such test takes
2·pintimeout time = 60s, it will significantly increase the time
needed to run WCFS tests. Avoid that by lowering the pin timeout by
one order of magnitude, to pintimeout=3s, during faulty protection
tests.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18


wcfs: tests: Add context to tWCFS · 869e597d

Kirill Smelkov authored Sep 16, 2024

This testing helper limits whole test time to detect FUSE-related
deadlocks by aborting the FUSE connection on timeout. It has worked well so
far. But soon we will need pinkill-related tests, where the timeout will
need to be detected independently of the FUSE connection. Expose tWCFS.ctx
so that tests can use this context and do things limited in time.
Adjust FUSE aborting to correlate exactly with this context
cancellation.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18


wcfs: tests: Move test for verifying protection against faulty/slow clients to dedicated file · ab38f971

Kirill Smelkov authored Sep 16, 2024

We are going to add more tests on this topic + supporting infrastructure.
It makes sense to first move everything related into a dedicated test file
as a preparatory step because wcfs_test.py already feels overloaded.

Plain code movement.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18


wcfs: Fix setupWatch vs setupWatch race on the same file · 64468d47

Kirill Smelkov authored Sep 15, 2024

WCFS allows issuing simultaneous watch requests and when two watch
requests are simultaneously issued for the same file there was a race in
their handling: the code was relying on w.atMu.W to protect setupWatch
from concurrent readPinWatcher, and also, seemingly from another
setupWatch running on the same file.

But there is a bug about that: lacking an atomic primitive to downgrade
RWMutex from wlock to rlock, atMu.W was first fully unlocked and then
rlocked again. The code was prepared for readPinWatcher to start running in
that unlock->rlock time window, but it was not prepared for another
setupWatch starting to run on the same file in that pause time.

-> Fix that via using dedicated Watch.setupMu lock that protects
   setupWatch from setupWatch.

Test is, hopefully, TODO.

My mistake from 6f0cdaff (wcfs: Provide isolation to clients)

/reviewed-by @levin.zimmermann
/reviewed-on !18


wcfs: Fix readPinWatchers error path · 7bbd6177

Kirill Smelkov authored Sep 15, 2024

Inside readPinWatchers:

    https://lab.nexedi.com/nexedi/wendelin.core/-/blob/wendelin.core-2.0.alpha3-26-g79e6f7b9/wcfs/wcfs.go#L1536-1591

if δFtail.BlkRevAt would return an error, then f.watchMu was not
RUnlocked back, and wg.Wait was not called at all.

-> Fix that by scheduling unlock and wg wait right after f.watchMu is
   rlocked and workgroup is created.

Test is, hopefully, TODO.

My mistake from 6f0cdaff (wcfs: Provide isolation to clients)

/reviewed-by @levin.zimmermann
/reviewed-on !18
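
The fix pattern is Go's `defer`; as an illustration only, here is a Python analogue using `contextlib.ExitStack`. It is a sketch, not a translation of wcfs.go: `read_pin_watchers`, `watchers` and `blk_rev_at` are hypothetical stand-ins, a plain `threading.Lock` replaces the read side of f.watchMu, and joining threads replaces wg.Wait.

```python
import threading
from contextlib import ExitStack


def read_pin_watchers(watch_mu, watchers, blk_rev_at):
    with ExitStack() as defer:
        watch_mu.acquire()
        defer.callback(watch_mu.release)      # scheduled right after locking

        threads = []
        # Scheduled right after the "workgroup" is created; runs before the
        # unlock above because callbacks are executed in LIFO order.
        defer.callback(lambda: [t.join() for t in threads])

        rev = blk_rev_at()                    # may raise; cleanup still runs
        for pin in watchers:
            t = threading.Thread(target=pin, args=(rev,))
            t.start()
            threads.append(t)
```

With the cleanup registered up front, the error path and the normal path release the lock and wait for workers in the same way.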


wcfs: Cleanup wlinkTab entry when client drops opened head/watch handle · b20a26cb

Kirill Smelkov authored Sep 15, 2024

The code was already behaving like that, but there was an XXX to do it. Add
a test to verify it is actually done.

An opened WatchLink handle is released after RELEASE because the
read in WatchLink.serve, after RELEASE, returns EOF and then the code
inside WCFS does all necessary WatchLink-related cleanup:

https://lab.nexedi.com/nexedi/wendelin.core/-/blob/wendelin.core-2.0.alpha3-26-g79e6f7b9/wcfs/wcfs.go#L1828-1872

/reviewed-by @levin.zimmermann
/reviewed-on !18


wcfs: Cleanup zheadSockTab entry when client drops opened .wcfs/zhead handle · 87818b0d

Kirill Smelkov authored Sep 15, 2024

This was marked as TODO in server code and not implemented.
Without this cleanup zheadSockTab was growing indefinitely after every
open/close and leaking memory.

-> Fix it via registering RELEASE handler to FUSE and removing
corresponding zheadSockTab entry from there.

/reviewed-by @levin.zimmermann
/reviewed-on !18
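
To make the leak and its fix concrete, here is a toy model of such a table. It is only a sketch under assumptions: `ZHeadSockTab`, `open` and `release` are hypothetical names, and the real table lives in the Go server where the removal is done from the FUSE RELEASE handler.

```python
import threading


class ZHeadSockTab:
    """Toy table of opened .wcfs/zhead handles (illustrative only)."""

    def __init__(self):
        self._mu = threading.Lock()
        self._tab = {}
        self._next_fh = 0

    def open(self, sock):
        # Register a new entry on every open.
        with self._mu:
            fh = self._next_fh
            self._next_fh += 1
            self._tab[fh] = sock
            return fh

    def release(self, fh):
        # Counterpart of the RELEASE handler: without this removal the
        # table would grow by one entry per open/close cycle.
        with self._mu:
            self._tab.pop(fh, None)

    def __len__(self):
        with self._mu:
            return len(self._tab)
```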


wcfs: Add .wcfs/stats file with basic usage statistics · 8abfd27d

Kirill Smelkov authored Sep 15, 2024

Report there the number of WCFS-internal instances, e.g. the number of tracked
BigFiles, WatchLinks etc, and also the number of counted events, for example
how many times a pin event happened.

Soon we will need these statistics to implement tests, e.g. for pinkilling
and other functionality, and they might also be useful to have in general.

/reviewed-by @levin.zimmermann
/reviewed-on !18


wcfs: Fix wlinkTab locking · 96b216f6

Kirill Smelkov authored Sep 15, 2024

ZWatcher says it does not need to lock wlinkMu because it is already
holding zheadMu and setupWatch runs with zheadMu locked. That is indeed
true, but the mistake here is that it is not only setupWatch that
accesses wlinkTab. For example WatchNode.Open registers new entries
there only under wlinkMu:

https://lab.nexedi.com/nexedi/wendelin.core/-/blob/wendelin.core-2.0.alpha3-26-g79e6f7b9/wcfs/wcfs.go#L1819-1822

-> Fix it by always using wlinkMu when accessing wlinkTab.

My mistake from 6f0cdaff (wcfs: Provide isolation to clients)

Test is, hopefully, TODO.

/reviewed-by @levin.zimmermann
/reviewed-on !18


wcfs: Switch debug.zheadSockTab to fine-grained locking · 82359abe

Kirill Smelkov authored Sep 15, 2024

Previously we were protecting access to zheadSockTab with zheadMu
because this table was accessed from only two places: when opening
.wcfs/zhead and in zwatcher. Soon we are going to add another place that
will access this table, and still using the big zheadMu seems less and less
logical.

-> Switch to using dedicated lock to protect table of .wcfs/zhead opens
   as preparatory step for that.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18


wcfs: Switch filesystem to EIO mode on zwatcher failure · a36b5562

Kirill Smelkov authored Sep 15, 2024

Currently a zwatcher failure leads to wcfs starting to provide stale data
instead of up-to-date data. Fix that by detecting zwatcher failures and
explicitly switching the filesystem to a mode where any access to
anything returns "input/output error".

Zwatcher can fail on e.g. failure to retrieve transactions from ZODB
storage or any other failure. With this patch we make sure this does not
go unnoticed.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18
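
As an illustration of the behaviour described above, here is a toy model; it is not how wcfs implements it. `ToyFS`, `zwatcher_failed` and `read` are hypothetical names; the only point shown is that once the failure is recorded, every access fails with EIO instead of returning possibly stale data.

```python
import errno


class ToyFS:
    """Toy filesystem that switches to EIO mode on watcher failure."""

    def __init__(self, files):
        self._files = dict(files)
        self._eio_reason = None          # None while the watcher is healthy

    def zwatcher_failed(self, reason):
        # Record only the first failure; the switch is one-way.
        if self._eio_reason is None:
            self._eio_reason = reason

    def read(self, path):
        if self._eio_reason is not None:
            raise OSError(errno.EIO, "input/output error")
        return self._files[path]
```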


wcfs: Remove TODO to teach go-fuse about Init.MaxPages · 6dfcb69e

Kirill Smelkov authored Sep 15, 2024

go-fuse added functionality to handle Init.MaxPages in
https://github.com/hanwen/go-fuse/commit/265a39266958.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18


23 Jul, 2024 3 commits

lib/zodb: Drop client-only parameters from normalized NEO URI · 79e6f7b9

Levin Zimmermann authored Jul 19, 2024

We need to drop client-specific options so that NEO URIs that differ only
in client options, while actually pointing to the same NEO server,
are equal after normalization.

--------
kirr: See nexedi/neoppod!18 for
the discussion on this subject.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!28
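
A minimal sketch of the dropping step only, assuming the options are carried in the URI query string. The option names in `CLIENT_ONLY` are hypothetical placeholders, not the list used by wendelin.core or NEO, and `drop_client_only` is an illustrative name.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical set of client-only options; the real set is defined by NEO.
CLIENT_ONLY = {"compress", "logfile", "cache_size"}


def drop_client_only(zurl):
    """Return zurl with client-only query parameters removed."""
    u = urlsplit(zurl)
    kept = [(k, v) for k, v in parse_qsl(u.query, keep_blank_values=True)
            if k not in CLIENT_ONLY]
    # Sort what remains so that parameter order does not affect equality.
    query = urlencode(sorted(kept))
    return urlunsplit((u.scheme, u.netloc, u.path, query, u.fragment))
```

Two URIs that differ only in such options then compare equal after this step.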


lib/zodb: Update NEO URI format to be in sync with upstream NEO · 2c0968e4

Levin Zimmermann authored Jul 19, 2024

NEO/go and NEO/py URI format diverged over time:

- neo@8c974485

However with nexedi/neoppod!21 a
common solution was found. With neo!7 NEO/go and NEO/py
URI formats are in sync again. We therefore now need to update
'wendelin.core' to support the finally agreed on URI format.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!28


wcfs: Update NEO/go to sync URI format · 921ad362

Levin Zimmermann authored Jul 22, 2024

With neo@95572d6a we synchronized
NEO/go URI format with NEO/py URI format. We need this new
NEO/go version to apply this synchronization to 'wendelin.core'
ZODB tools (which we'll do in the next patches).

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!28


22 Jul, 2024 1 commit

bigfile/zodb: Apply auto format as default only in WCFS mode · 34309058

Kirill Smelkov authored Jul 22, 2024

This semantically reverts 99f262dd (bigfile/zodb: Make auto format the
default) for wendelin.core-1 mode because in non-WCFS mode there are
known problems with data corruption on BTree topology changes(*) and
auto mode actually does change those topologies with first setting
ZBigFile[blk] -> ZBlk1 and then updating the same block to point to
ZBlk0 object.

Avoid triggering those problems and use auto as default only in WCFS
mode, which should handle invalidations with all those BTree topology
changes well.

The patch is based on suggestion by Levin Zimmermann: nexedi/wendelin.core!20 (comment 212405)

We have to move _default_use_wcfs because now it is invoked at module
import time and needs to be already defined at the time of the call.

(*) see nexedi/wendelin.core@8c32c9f6 for details.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!29


25 Jun, 2024 6 commits

wcfs: _mntpt_4zurl: Fix it to accept strings. · 07087ec8

Carlos Ramos Carreño authored Jun 24, 2024

Strings cannot be directly hashed without encoding them first, or
an error will be raised:

```python
______________________________ test_zsync_resync _______________________________

    @func
    def test_zsync_resync():
        zstor = testdb.getZODBStorage()
        defer(zstor.close)

>       db, zconn, wconn = _zsync_setup(zstor)

wcfs/client/_wczsync_test.py:112:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../venvs/wendelin.core/lib/python3.9/site-packages/decorator.py:232: in fun
    return caller(func, *(extras + args), **kw)
../pygolang/golang/__init__.py:125: in _
    return f(*argv, **kw)
wcfs/client/_wczsync_test.py:53: in _zsync_setup
    wc = wcfs.join(zurl)
wcfs/__init__.py:201: in join
    mntpt = _mntpt_4zurl(zurl)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

zurl = 'file:///srv/slapgrid/slappart66/tmp/testdb_fs.xstpbg49/1.fs'

    def _mntpt_4zurl(zurl):
        # normalize zurl so that even if we have e.g. two neos:// urls coming
        # with different paths to ssl keys, or with different order in the list of
        # masters, we still have them associated with the same wcfs mountpoint.
        zurl = zurl_normalize_main(zurl)

        m = hashlib.sha1()
>       m.update(zurl)
E       TypeError: Strings must be encoded before hashing
```

We fix this error by encoding the string as UTF8 before hashing it.

--------
kirr:

Use b instead of doing

    if isinstance(zurl, six.text_type):
      zurl = zurl.encode("utf-8")

wcfs already takes this approach of using b in other places - for
example in tDB.change:

    # change schedules zf to be changed according to changeDelta at commit.
    #
    # changeDelta: {} blk -> data.
    # data can be both bytes and unicode.              <-- NOTE
    def change(t, zf, changeDelta):
        assert isinstance(zf, ZBigFile)
        zfDelta = t._changed.setdefault(zf, {})
        for blk, data in six.iteritems(changeDelta):
            data = b(data)                             <-- NOTE
            ...

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!27
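
The error and the fix can be reproduced with the standard library alone. In this sketch `_as_bytes` is a hypothetical stand-in for the `b` helper mentioned above, and `zurl_digest` is an illustrative name rather than the actual `_mntpt_4zurl`.

```python
import hashlib


def _as_bytes(s):
    # Hypothetical stand-in for `b`: accept both text and bytes.
    return s.encode("utf-8") if isinstance(s, str) else s


def zurl_digest(zurl):
    """Hash a zurl given either as str or as bytes."""
    m = hashlib.sha1()
    m.update(_as_bytes(zurl))    # m.update(str) would raise TypeError on py3
    return m.hexdigest()
```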


wcfs: tests: Adapt changed modules/methods to Python 3. · 594ff3fa

Carlos Ramos Carreño authored Jun 24, 2024

Some modules and methods have changed names in Python 3.
The `thread` module has been renamed to `_thread` and the old name
raises an error when run on Python 3:

```python
Traceback:
/opt/slapgrid/b0df76c24a1d2728ccf3e276f07c1790/parts/python3/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
wcfs/client/client_test.py:32: in <module>
    from wendelin.wcfs.wcfs_test import tDB, tAt, timeout, eprint
wcfs/wcfs_test.py:44: in <module>
    from thread import get_ident as gettid
E   ModuleNotFoundError: No module named 'thread'
```

In a similar vein, the `items` method of dictionaries plays the same
role as the old `iteritems`.

We use the `six` module to paper over these differences.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!27
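
For illustration, the two differences can also be bridged with the standard library only. This is a sketch of what such a compatibility shim does, not the code from this commit, which uses `six`; the `iteritems` helper here is a hypothetical local function.

```python
try:
    from thread import get_ident as gettid      # Python 2 name
except ImportError:
    from _thread import get_ident as gettid     # Python 3 name


def iteritems(d):
    """Iterate over (key, value) pairs on both Python 2 and Python 3."""
    if hasattr(d, "iteritems"):                 # Python 2 dict
        return d.iteritems()
    return iter(d.items())                      # Python 3 dict
```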


wcfs: tests: xbtree.py: Execute `zip` eagerly when we need list. · d014045b

Carlos Ramos Carreño authored Jun 24, 2024

The builtin `zip` in Python 3 returns an iterator, not a list.
Thus, one cannot directly call `len` on the object returned
by `zip`, or we will have errors like the following one:

```python
Traceback (most recent call last):
  File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 617, in <module>
    main()
  File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 613, in main
    cmd(argv)
  File "/srv/slapgrid/slappart66/venvs/wendelin.core/lib/python3.9/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/srv/slapgrid/slappart66/git/pygolang/golang/__init__.py", line 125, in _
    return f(*argv, **kw)
  File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 589, in cmd_trees
    TreesSrv(zstor, r)
  File "/srv/slapgrid/slappart66/venvs/wendelin.core/lib/python3.9/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/srv/slapgrid/slappart66/git/pygolang/golang/__init__.py", line 125, in _
    return f(*argv, **kw)
  File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 234, in TreesSrv
    treetxtPrev = zctx.ztreetxt(ztree)
  File "/srv/slapgrid/slappart66/venvs/wendelin.core/lib/python3.9/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/srv/slapgrid/slappart66/git/pygolang/golang/__init__.py", line 125, in _
    return f(*argv, **kw)
  File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 536, in ztreetxt
    return zctx.TopoEncode(xbtree.StructureOf(ztree))
  File "/srv/slapgrid/slappart66/venvs/wendelin.core/lib/python3.9/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/srv/slapgrid/slappart66/git/pygolang/golang/__init__.py", line 125, in _
    return f(*argv, **kw)
  File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 542, in TopoEncode
    return xbtree.TopoEncode(tree, zctx.vencode)
  File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree.py", line 797, in TopoEncode
    for nodev in _walkBFS(tree):
  File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree.py", line 701, in _walkBFS
    for level in __walkBFS(tree):
  File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree.py", line 724, in __walkBFS
    assert len(rv) == len(rn.node.children)
TypeError: object of type 'zip' has no len()
```

Thus, we have to create a list from the result of `zip` before calling
`len` on it.

--------
kirr:

There were only two places where zip was used to build a list. All other
places where zip is used - both in wcfs/xbtree and in other packages -
are calling zip to iterate over zip result:

    (py39.venv) kirr@deca:~/src/wendelin/wendelin.core$ git grep -w zip
    bigarray/__init__.py:        for n, s in zip(self.shape, self.stridev):
    bigarray/__init__.py:        for n, s in zip(a.shape, a.strides):
    bigarray/array_zodb.py:BigArray_defaults = dict(zip(reversed(_.args), reversed(_.defaults)))
    wcfs/internal/xbtree.py:            for i, (klo, khi) in enumerate(zip(v[:-1], v[1:])): # (klo, khi) = [] of (k_i, k_{i+1})
    wcfs/internal/xbtree.py:                kvv = ['%s:%s' % (k,v) for (k,v) in zip(b.keyv, b.valuev)]
    wcfs/internal/xbtree.py:        for (j,i) in zip(jv, iv):
    wcfs/internal/xbtree.py:                    for (child, k) in zip(node.children[1:], node.keyv):
    wcfs/internal/xbtree.py:                    for (k,v) in zip(node.keyv, node.valuev):
    wcfs/internal/xbtree.py:            for (xlo, xhi) in zip(ksplitv[:-1], ksplitv[1:]): # (klo, s1), (s1, s2), ..., (sN, khi)
    wcfs/internal/xbtree.py:            for (xlo, xhi) in zip(ksplitv[:-1], ksplitv[1:]): # (klo, s1), (s1, s2), ..., (sN, khi)
    wcfs/internal/xbtree.py:                                    for (k,vtxt) in zip(node.keyv, vtxtv)])
    wcfs/internal/xbtree/xbtreetest/treegen.py:                    for (k,v) in zip(node.keyv, node.valuev):
    wcfs/internal/xbtree_test.py:    for (child, childOK) in zip(kids, children):
    wcfs/internal/xbtree_test.py:        for (i,(k,v)) in enumerate(zip(keys, values)):

    # handled in hereby patch
    wcfs/internal/xbtree.py:                rv = list(zip(v[:-1], v[1:]))  # (klo,k1), (k1,k2), ..., (kN,khi)
    wcfs/internal/xbtree.py:                rv = list(zip(v[:-1], v[1:]))  # (klo,k1), (k1,k2), ..., (kN,khi)

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!27
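
A small self-contained illustration of the difference, using made-up key values; `v` and `rv` mirror the names in the grep output above, but the data is hypothetical.

```python
v = [0, 10, 20, 30]

# Lazy iterator on Python 3: fine for `for` loops, but it has no length.
pairs = zip(v[:-1], v[1:])
try:
    len(pairs)
    has_len = True
except TypeError:
    has_len = False

# Materialize the pairs when a list is really needed.
rv = list(zip(v[:-1], v[1:]))   # (klo,k1), (k1,k2), ..., (kN,khi)
```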


bigarray: tests: Do not use `numpy.object`. · 84d3d775

Carlos Ramos Carreño authored Jun 24, 2024

`numpy.object` was an alias for the builtin `object`, so we can use
`object` instead:

```python
_________________________ test_bigarray_noobject[tRAM] _________________________

testbig = <bigarray.tests.test_basic.tRAM object at 0x7f6d114ead60>

    def test_bigarray_noobject(testbig):
        Zh = testbig.fopen()

        # NOTE str & unicode are fixed-size types - if size is not explicitly given
        # it will become S0 or U0
>       obj_dtypev = [numpy.object, 'O', 'i4, O', [('x', 'i4'), ('y', 'i4, O')]]

bigarray/tests/test_basic.py:110:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

attr = 'object'

    def __getattr__(attr):
        # Warn for expired attributes, and return a dummy function
        # that always raises an exception.
        import warnings
        import math
        try:
            msg = __expired_functions__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)

            def _expired(*args, **kwds):
                raise RuntimeError(msg)

            return _expired

        # Emit warnings for deprecated attributes
        try:
            val, msg = __deprecated_attrs__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)
            return val

        if attr in __future_scalars__:
            # And future warnings for those that will change, but also give
            # the AttributeError
            warnings.warn(
                f"In the future `np.{attr}` will be defined as the "
                "corresponding NumPy scalar.", FutureWarning, stacklevel=2)

        if attr in __former_attrs__:
>           raise AttributeError(__former_attrs__[attr])
E           AttributeError: module 'numpy' has no attribute 'object'.
E           `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
E           The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
E               https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
```

--------
kirr:

On py2:

    In [1]: import numpy

    In [2]: numpy.__version__
    Out[2]: '1.16.6'

    In [3]: numpy.object
    Out[3]: object

    In [4]: numpy.object is object
    Out[4]: True

this change is, thus, indeed safe to make.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!27


lib/zodb: tests: Do not import NEO globally. · b8a98631

Carlos Ramos Carreño authored Jun 24, 2024

NEO is still not ported to Python 3.
Importing NEO globally thus makes pytest tests fail during the
assert-rewriting step:
```python
../../venvs/wendelin.core/lib/python3.9/site-packages/_pytest/assertion/rewrite.py:178: in exec_module
    exec(co, module.__dict__)
lib/tests/test_zodb.py:36: in <module>
    from neo.client.Storage import Storage as NEOStorage
../neoppod/neo/client/__init__.py:52: in <module>
    from . import app # set up signal handlers early enough to do it in the main thread
E     File "/srv/slapgrid/slappart66/git/neoppod/neo/client/app.py", line 356
E       except NEOStorageReadRetry, e:
E                                 ^
E   SyntaxError: invalid syntax
```

An MR adding enough support to not fail at import time is proposed in
nexedi/neoppod!24 .
However, that MR will not be reviewed until the vacation period is over.

In the meantime, and as a previous step to make running NEO tests
optional, the import has been moved inside the function loading NEO.
Thus, only the tests that require NEO will fail.

--------
kirr:

Add TODO to revert to import NEO globally after lab.nexedi.com/nexedi/neoppod/-/merge_requests/24 is landed.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!27


setup: Install `ZEO[test]` when testing. · d640c701

Carlos Ramos Carreño authored Jun 24, 2024

`ZEO[test]` should be installed when testing, so that `zope.testing` is
installed.
Otherwise, an import error may be raised when running the test if
`zope.testing` has not been manually installed.

--------
kirr:

Adjust tox.ini to no longer install zope.testing explicitly since we are
now requesting ZEO[test] under our own tests extra, and tox.ini installs
.[test] as the primary package.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!27


21 Jun, 2024 1 commit

*: Do not use relative imports · 3846997b

Kirill Smelkov authored Jun 20, 2024

Because of the way wendelin.core organizes its in-tree python importing
redirector (see wendelin.py) it is possible to import the same module
twice with python thinking it is importing two different modules. For
example when installed in develop mode python resolves the following
imports to the same bigfile/__init__.py

    import wendelin.bigfile
    import bigfile

but tries to load that module twice and independently. Which leads to
virtmem DSO, linked to from under bigfile/_bigfile extension, being
initialized twice and complaining about that because only single gil
hook should be requested to be installed:

    (py39.venv) kirr@deca:~/src/wendelin/wendelin.core$ python
    Python 3.9.19+ (heads/3.9:40d77b93672, Apr 12 2024, 06:40:05)
    [GCC 12.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import wendelin.bigfile
    >>> import bigfile
    python: bigfile/virtmem.c:106: virt_lock_hookgil: Assertion `!(virtmem_gilhooks)' failed.
    Aborted

This problem was there from day 1, but it was not creating issues in
practice because wendelin.core users do `import wendelin...` and there
was also no problem with running pytest in the source tree.

However with py39 and pytest8 we see that running pytest somehow started
to unconditionally import things from under two namespaces which leads
to inability to run tests even when instructing pytest to collect them
via python-modules namespace instead of filesystem:

    (py39.venv) kirr@deca:~/src/wendelin/wendelin.core$ pytest -vsx --pyargs wendelin.bigfile.tests.test_basic
    ======================== test session starts ========================
    platform linux -- Python 3.9.19+, pytest-8.2.2, pluggy-1.5.0 -- /home/kirr/src/wendelin/venv/py39.venv/bin/python3.9
    cachedir: .pytest_cache
    rootdir: /home/kirr/src/wendelin/wendelin.core
    configfile: pyproject.toml
    collecting ... python3.9: bigfile/virtmem.c:106: virt_lock_hookgil: Assertion `!(virtmem_gilhooks)' failed.
    Fatal Python error: Aborted

    Current thread 0x00007fa172a60740 (most recent call first):
      File "<frozen importlib._bootstrap>", line 228 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 1173 in create_module
      File "<frozen importlib._bootstrap>", line 565 in module_from_spec
      File "<frozen importlib._bootstrap>", line 666 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 986 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 1007 in _find_and_load
      File "/home/kirr/src/wendelin/wendelin.core/bigfile/__init__.py", line 31 in <module>
      File "<frozen importlib._bootstrap>", line 228 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 850 in exec_module
      File "<frozen importlib._bootstrap>", line 680 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 986 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 1007 in _find_and_load
      File "<frozen importlib._bootstrap>", line 1030 in _gcd_import
      File "<frozen importlib._bootstrap>", line 228 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 972 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 1007 in _find_and_load
      File "<frozen importlib._bootstrap>", line 1030 in _gcd_import
      File "<frozen importlib._bootstrap>", line 228 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 972 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 1007 in _find_and_load
      File "<frozen importlib._bootstrap>", line 1030 in _gcd_import
      File "/home/kirr/local/py3.9/lib/python3.9/importlib/__init__.py", line 127 in import_module
      File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/pathlib.py", line 591 in import_path
      File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/python.py", line 492 in importtestmodule
      ...
      File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/runner.py", line 567 in collect_one_node
      File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/main.py", line 837 in _collect_one_node
      File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/main.py", line 974 in genitems
      File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/main.py", line 811 in perform_collect
      File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/main.py", line 349 in pytest_collection
      ...
      File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/config/__init__.py", line 178 in main
      File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/config/__init__.py", line 206 in console_main
      File "/home/kirr/src/wendelin/venv/py39.venv/bin/pytest", line 8 in <module>
    Aborted (core dumped)

This happens because wendelin.bigfile is importing
wendelin.bigfile._bigfile as `from ._bigfile import ...` which under
pytest leads to importing both wendelin.bigfile._bigfile and
bigfile._bigfile and further conflicting when setting up GIL hooks.

-> Fix this issue by avoiding relative imports and always referring to
   wendelin.core modules with `wendelin.` prefix.

The list of places where relative imports were used was small and found via

    $ git grep -w import |grep '\s\.'
    bigfile/__init__.py:from ._bigfile import BigFile, WRITEOUT_STORE, WRITEOUT_MARKSTORED, ram_reclaim
    wcfs/__init__.py:from .client._wcfs import \

Everywhere else we were already importing things from under wendelin
namespace via fully specified module path.

After the fix both

    $ pytest -vsx --pyargs wendelin.bigfile.tests.test_basic

and

    $ pytest -vsx bigfile/tests/test_basic.py

start to work ok from inside the worktree.

/reported-and-tested-by @vnmabus
/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!26
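
The double initialization can be reproduced in miniature without wendelin.core. The sketch below loads one source file under two module names via `importlib.util`; the file, the module names and the `INIT_TOKEN` marker are all hypothetical and only show that Python treats the two names as independent modules and runs the module body twice.

```python
import importlib.util
import os
import tempfile


def load_as(name, path):
    """Load the file at `path` as a module called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "bigfile_demo.py")
        with open(path, "w") as f:
            # A fresh object is created each time the module body runs.
            f.write("INIT_TOKEN = object()\n")
        a = load_as("wendelin_demo.bigfile_demo", path)
        b = load_as("bigfile_demo", path)
        return a is b, a.INIT_TOKEN is b.INIT_TOKEN
```

Both comparisons come out false: same file, two module objects, two initializations, which is what trips the GIL-hook assertion in the real extension.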


07 Jun, 2024 1 commit

setup: Allow editable wheels. · 37cf1383

Carlos Ramos Carreño authored Jun 06, 2024

When building an editable wheel it is not necessary that
`build_packages` (or even `run`) is called before calling `get_outputs`
(notice the following in
https://setuptools.pypa.io/en/latest/userguide/extension.html#supporting-sdists-and-editable-installs-in-build-sub-commands :
"Please note that custom sub-commands SHOULD NOT rely on `run()` being
executed (or not) to provide correct return values for `get_outputs()`,
`get_output_mapping()` or `get_source_files()`. The `get_*` methods
should work independently of `run()`.").

Our implementation relied on the call to `build_packages` to set the
name of the synthetic init file.
This commit uses a property of the object instead, to compute that name
whenever it is necessary.
With this change, it is now possible to make editable wheels.

--------
kirr:

Without the fix `pip install -e` fails as follows on py3:

  Traceback (most recent call last):
    File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 155, in run
      self._create_wheel_file(bdist_wheel)
    File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 357, in _create_wheel_file
      files, mapping = self._run_build_commands(dist_name, unpacked, lib, tmp)
    File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 281, in _run_build_commands
      files, mapping = self._collect_build_outputs()
    File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 266, in _collect_build_outputs
      files.extend(cmd.get_outputs() or [])
    File "<string>", line 137, in get_outputs
    File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/build_py.py", line 78, in __getattr__
      return orig.build_py.__getattr__(self, attr)
    File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 107, in __getattr__
      raise AttributeError(attr)
  AttributeError: initfile

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!25
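
A minimal sketch of the approach, detached from setuptools: the name is computed by a property on demand instead of being assigned as a side effect of another method. `BuildPySketch` and the path layout are hypothetical; only the attribute name `initfile` comes from the traceback above.

```python
import os


class BuildPySketch:
    """Toy build command whose outputs do not depend on run()."""

    def __init__(self, build_lib):
        self.build_lib = build_lib

    @property
    def initfile(self):
        # Computed whenever needed, so it is valid even if
        # build_packages()/run() were never called.
        return os.path.join(self.build_lib, "wendelin", "__init__.py")

    def get_outputs(self):
        return [self.initfile]
```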


31 May, 2024 2 commits

setup: Fix `wendelin_cy_searh_in_dirs` for Cython 3. · 17deca45

Carlos Ramos Carreño authored May 31, 2024

The interface of the function `search_include_directories` has changed
in Cython 3.0a7 in https://github.com/cython/cython/commit/f3f7b612.
This updates the replacement used by wendelin so that it works for
both newer and older versions.

Note that wendelin.core still does not work with Cython >= 3, as that
version refuses to compile Python functions that can throw C++
exceptions (apparently, mixing C++ exceptions and Cython-generated
code is not considered safe).

/reviewed-by @kirr
/reviewed-on !24

17deca45

nxdtest: Don't test on NEO for Python 3. · 5578aacd

Carlos Ramos Carreño authored May 29, 2024

NEO is still not ported to Python 3, so Python 3 tests should not use this backend.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!24

5578aacd

03 Apr, 2024 2 commits

bigfile/zodb: Make auto format the default · 99f262dd

Levin Zimmermann authored Jan 25, 2024

If a user doesn't explicitly declare a ZBlk format, it can be assumed
that this user wants to have the best ratio between consumed storage space
and data access speed. Currently the best ratio between these two is
provided by the new 'auto' (heuristic) format. In case of small appends this
format helps reduce storage space, and in any other case it just
behaves like ZBlk0 [1]. Therefore this default ensures a fast access speed [2],
but also avoids a massive data growth in case of many small appends [3].

[1] An exception to this is: in its current implementation a block
behaves like ZBlk1 (slow access) in case it isn't fully filled up yet.

[2] As this was stated as a reason why ZBlk1 as a default format was
reverted in nexedi/wendelin.core@0b68f178.

[3] This was perhaps the reason why ZBlk1 was set to be the default format
in nexedi/wendelin.core@9ae42085. The massive
storage space consumption can already be a problem with just a few arrays to
which small data is regularly appended, as can easily happen with
Wendelin development instances.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!20

99f262dd

bigfile/zodb: Add ZBlk format option 'auto' (heuristic) · d6628427

Levin Zimmermann authored Oct 25, 2023

There are two formats to save data with a ZBigFile: ZBlk0 and ZBlk1.
They differ in the trade-off between access time and disk-space growth:
ZBlk1 is better with regard to disk space, while ZBlk0
has a better access time. Wendelin.core users may not always know or
care which format fits their data better. In this case it may be
easier for users to just let the program automatically select the ZBlk
format. With this patch and the new 'auto' (for heuristic) option of the
'ZBlk' argument of ZBigFile, this is now possible. The 'auto' option isn't
really a new ZBlk format in itself, but it just tries to automatically
select the best ZBlk format option according to the characteristics
of the changes that the user applies to the ZBigFile.

In its current implementation, the heuristic tackles the use-case of
large arrays with many small append-only changes. In this case 'auto' is
smaller in space than ZBlk0, but faster to read than ZBlk1. It does so
by initially using ZBlk1 until a blk is filled up. Once a blk is full,
it switches to ZBlk0, as was recommended by @kirr in
nexedi/wendelin.core!20 (comment 196084).
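
A minimal sketch of such a per-block decision, assuming that "not filled up yet" can be detected by trailing zero bytes. The function `choose_zblk_format` and this fullness criterion are illustrative assumptions; the actual heuristic in wendelin.core may use different criteria.

```python
def choose_zblk_format(blkdata: bytes, blksize: int) -> str:
    """Pick a format name for one block (hypothetical helper).

    Assumption: a block counts as "not full yet" while its tail is still
    zero-filled. Such a block is stored ZBlk1-like (compact for small
    appends); a full block is stored ZBlk0-like (fast to read).
    """
    if len(blkdata) > blksize:
        raise ValueError("data does not fit into one block")
    used = len(blkdata.rstrip(b"\0"))   # bytes before the zero tail
    return "ZBlk0" if used == blksize else "ZBlk1"


fmt_partial = choose_zblk_format(b"abc", 8)        # block still being appended to
fmt_full = choose_zblk_format(b"abcdefgh", 8)      # block completely filled
```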

With this patch comes a test (bigfile/tests/bench_zblkfmt) that creates
benchmarks for different combinations and zblk formats. The test aims
to check how the 'heuristic' format performs in contrast to 'ZBlk0'
and 'ZBlk1':

    BenchmarkAppendSize/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]   1       538.1 MB
    BenchmarkAppendRandRead/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]       6 2.085 ms/blk
    BenchmarkAppendSize/zblk=ZBlk1/change_count=500/change_percentage_set=[0.014]   1       16.8 MB
    BenchmarkAppendRandRead/zblk=ZBlk1/change_count=500/change_percentage_set=[0.014]       6 14.564 ms/blk
    BenchmarkAppendSize/zblk=auto/change_count=500/change_percentage_set=[0.014]    1       29.4 MB
    BenchmarkAppendRandRead/zblk=auto/change_count=500/change_percentage_set=[0.014]        6 2.119 ms/blk
    BenchmarkRandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2]  1       1021.1 MB
    BenchmarkRandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      3 2.324 ms/blk
    BenchmarkRandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2]  1       216.2 MB
    BenchmarkRandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      3 15.317 ms/blk
    BenchmarkRandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2]   1       219.8 MB
    BenchmarkRandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2]       3 14.027 ms/blk
    BenchmarkRandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[1]    1       1048.6 MB
    BenchmarkRandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[1]        3 2.126 ms/blk
    BenchmarkRandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[1]    1       1070.4 MB
    BenchmarkRandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[1]        3 14.284 ms/blk
    BenchmarkRandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[1]     1       1070.3 MB
    BenchmarkRandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[1] 3 14.072 ms/blk
    BenchmarkRandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]        1       1046.4 MB
    BenchmarkRandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    3 2.137 ms/blk
    BenchmarkRandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]        1       638.2 MB
    BenchmarkRandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    3 14.083 ms/blk
    BenchmarkRandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1] 1       639.5 MB
    BenchmarkRandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]     3 13.937 ms/blk

and post-processed with benchstat from 3 such runs:

                                                                                            │     x.log     │
                                                                                            │       B       │
    AppendSize/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]                       513.2Mi ± 0%
    AppendSize/zblk=ZBlk1/change_count=500/change_percentage_set=[0.014]                       16.02Mi ± 0%
    AppendSize/zblk=auto/change_count=500/change_percentage_set=[0.014]                        28.04Mi ± 0%
    RandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      973.8Mi ± 0%
    RandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      206.2Mi ± 0%
    RandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2]       209.6Mi ± 0%
    RandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[1]       1000.0Mi ± 0%
    RandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[1]       1020.8Mi ± 0%
    RandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[1]        1020.7Mi ± 0%
    RandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    997.9Mi ± 0%
    RandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    608.6Mi ± 0%
    RandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]     609.9Mi ± 0%
    geomean                                                                                    353.0Mi

                                                                                                │    x.log    │
                                                                                                │   ms/blk    │
    AppendRandRead/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]                      2.094 ± 12%
    AppendRandRead/zblk=ZBlk1/change_count=500/change_percentage_set=[0.014]                      14.47 ±  1%
    AppendRandRead/zblk=auto/change_count=500/change_percentage_set=[0.014]                       2.168 ±  2%
    RandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2]     2.324 ±  1%
    RandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2]     13.73 ± 12%
    RandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      13.60 ±  3%
    RandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[1]       2.125 ±  2%
    RandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[1]       14.18 ±  3%
    RandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[1]        14.17 ±  1%
    RandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]   2.118 ±  1%
    RandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]   13.85 ±  2%
    RandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    13.80 ±  1%
    geomean                                                                                       6.423

See nexedi/wendelin.core!20 and
kirr/wendelin.core@da765ef7...0c6f0850 for the
preliminary history of this patch.
Co-authored-by: Kirill Smelkov <kirr@nexedi.com>

Fix typo.

d6628427

29 Mar, 2024 1 commit

lib/mem += memdelta · 84def52e

Kirill Smelkov authored Mar 29, 2024

This is a utility function that we will need in the next patch to
see how similar the data of two blocks are.

We use numpy for the implementation because this code will be hot and if we
don't use optimized C routines writeout will become very slow.
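
As an illustration of the approach, here is a sketch of counting differing bytes with numpy. The name `memdelta_sketch` and the handling of buffers of unequal length are assumptions made for this example and may not match the real `memdelta`.

```python
import numpy as np


def memdelta_sketch(mem1: bytes, mem2: bytes) -> int:
    """Return how many byte positions differ between mem1 and mem2."""
    a = np.frombuffer(mem1, dtype=np.uint8)
    b = np.frombuffer(mem2, dtype=np.uint8)
    common = min(len(a), len(b))
    # Assumption: bytes present in only one of the buffers count as different.
    ndelta = (len(a) - common) + (len(b) - common)
    # The element-wise comparison runs in optimized C code inside numpy.
    ndelta += int(np.count_nonzero(a[:common] != b[:common]))
    return ndelta


same = memdelta_sketch(b"abcd", b"abcd")
one_changed = memdelta_sketch(b"abcd", b"abXd")
```

The comparison and the count both happen inside numpy, so no Python-level loop over individual bytes is needed.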

Quoting draft patch kirr/wendelin.core@3f631932 :

    -> Also optimize ndelta computation - when done in plain python just
       this part was taking a lot of time as timing for initial writeup
       showed:

         writeup with ZBlk0: ~20-25s
         writeup with ZBlk1: ~20-30s
         writeup with auto:  was ~ 120s

       now, after switching to numpy for ndelta computation, whole runtime
       with 'auto' is taking ~ 35s. The whole runtime, if I observe
       benchmark execution correctly, is dominated by database writeup.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!20

84def52e

11 Dec, 2023 2 commits

fixup! wcfs: v↑ go dependencies · da765ef7

Kirill Smelkov authored Dec 12, 2023

3636242f does not talk about go-fuse, which is wcfs's direct dependency
and was actually updated by upstream:

kirr/go-fuse@9f9ad4a1

-> Update it as well.

da765ef7

wcfs: v↑ go dependencies · 3636242f

Levin Zimmermann authored Aug 17, 2023

Update all dependencies of WCFS to their recent versions:

- neo/go: Update to pick up support for NEO/go to handle multiple master nodes
- go123: Add support for Go1.21

The following dependencies were updated, but depend on higher go
versions than supported by wendelin.core

- glog: v1.1.0 needs go 1.19, but we still support go 1.18

The following dependencies were not updated by upstream at all:

- overflow
- og-rek
- errors
- testify

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!22

3636242f

01 Aug, 2023 1 commit

wcfs: v↑ go dependencies · 885b3556

Kirill Smelkov authored Jun 14, 2023

Update all dependencies of WCFS to their recent versions:

- go-fuse: update to pick up https://github.com/hanwen/go-fuse/commit/265a3926
  and https://github.com/hanwen/go-fuse/commit/90b055af. The first patch
  potentially improves performance, while the second fixes support for
  neo:// with multiple masters.
- go123: add support for go1.20
- testify: some bugfixes (we use this package only during tests)

The following dependencies were not updated by upstream at all:

- glog
- overflow
- ogórek
- pkg/errors

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!16

885b3556

30 Jul, 2023 5 commits

lib/zodb: Insure NEO with > 1 master always normalizes to same URI · c04e95f9

Levin Zimmermann authored Jun 23, 2023

If a NEO cluster has multiple master nodes, there is no agreed-upon
order in which the master node addresses appear in the URI.
To ensure we always get the same normalized URI among different
clients of a NEO cluster with more than one master node, we explicitly
sort the master node addresses with this patch.
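
The sorting step can be sketched as below. The URI layout used here (a comma-separated address list in the network location, optionally prefixed by `name@`) and the helper name `sort_neo_masters` are assumptions for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def sort_neo_masters(zurl: str) -> str:
    """Return zurl with its comma-separated master addresses sorted."""
    u = urlsplit(zurl)
    # Keep an optional "name@" prefix untouched; sort only the address list.
    prefix, at, addrs = u.netloc.rpartition("@")
    masters = ",".join(sorted(addrs.split(",")))
    return urlunsplit((u.scheme, prefix + at + masters, u.path, u.query, u.fragment))


a = sort_neo_masters("neo://m2:1234,m1:1234/cluster")
b = sort_neo_masters("neo://m1:1234,m2:1234/cluster")
```

Both calls yield the same string, so two clients that list the masters in different order end up with the same normalized URI.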

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!17

c04e95f9

lib/zodb/zurl_normalize_main += explicit filtering depending on scheme · f5275f82

Levin Zimmermann authored Jun 23, 2023

In the old source code we already filtered NEO URIs by dropping
credentials, but we applied this filter to any URI, not only NEO
ones. This patch adds a mechanism to apply different filters according to
the specific storage type. Starting with this patch,
'zurl_normalize_main' also refuses to normalize a URI with an unknown
scheme.
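
A rough sketch of scheme-dependent filtering, not the actual `zurl_normalize_main` code. The parameter names in `_NEO_CLIENT_ONLY`, the set of known schemes and the exception type are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names of client-only NEO parameters, for illustration only.
_NEO_CLIENT_ONLY = {"ca", "cert", "key"}


def _neo_filter(u):
    # Drop query parameters that may differ between clients of one storage.
    kept = [(k, v) for k, v in parse_qsl(u.query) if k not in _NEO_CLIENT_ONLY]
    return u._replace(query=urlencode(kept))


def _identity(u):
    return u


# One filter per known scheme; anything else is rejected.
_FILTERS = {"neo": _neo_filter, "zeo": _identity, "file": _identity}


def normalize_sketch(zurl: str) -> str:
    u = urlsplit(zurl)
    try:
        flt = _FILTERS[u.scheme]
    except KeyError:
        raise NotImplementedError("cannot normalize scheme %r" % u.scheme)
    return urlunsplit(flt(u))


neo = normalize_sketch("neo://m1:1234/cluster?ca=/a/ca.crt&read_only=1")
zeo = normalize_sketch("zeo://localhost:9001")
```

Keeping the filters in a table keyed by scheme makes the "unknown scheme" case explicit instead of silently applying a NEO-specific filter to every URI.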

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!17

f5275f82

qa: lib/tests/zodb += zurl_normalize_main · 6032b274

Levin Zimmermann authored Jun 23, 2023

After moving zurl filtering to a dedicated function, we can
now test this function for correctness. It's important that different
clients which point to the same storage always result in the same
zodburi, even if their initial user-specified zodburi slightly differs
(e.g. due to different client-side parameters or different paths of encryption).

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!17

6032b274

wcfs: Move zuri filter to lib/zodb · ae54c563

Levin Zimmermann authored Jun 22, 2023

The WCFS mountpoint of any ZODB storage must be a unique, persistent,
repeatable hash. This means any client which uses the same storage must
always calculate the same WCFS mountpoint (independent from
client-only parameters etc.). Therefore the WCFS mountpoint calculation
must be robust for all supported ZODB storage types (at least NEO, ZEO,
filestorage).
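
To illustrate the requirement, a mountpoint can be derived from a hash of the already normalized URI. The hash function, the base directory and the name `mountpoint_sketch` are assumptions for this sketch and may differ from what WCFS actually does.

```python
import hashlib
import os


def mountpoint_sketch(normalized_zurl: str, base: str = "/dev/shm/wcfs") -> str:
    """Map a normalized zurl to a directory name (hypothetical layout)."""
    digest = hashlib.sha1(normalized_zurl.encode("utf-8")).hexdigest()
    return os.path.join(base, digest)


mnt1 = mountpoint_sketch("neo://m1:1234,m2:1234/cluster")
mnt2 = mountpoint_sketch("neo://m1:1234,m2:1234/cluster")
other = mountpoint_sketch("zeo://localhost:9001")
```

The result depends only on the input string, so the repeatability of the mountpoint rests entirely on every client producing the same normalized URI for the same storage.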

It was recently decided [1] that in order to provide this robustness, WCFS
mountpoint calculation should filter the parsed URI in order to drop
parts that prevent repeatability/persistence across different
clients (e.g. parts which can differ between clients although the same
storage is accessed). In order to make this filtering implementation a
bit easier to read and the wcfs/__init__.py less dense, the first step
is to move the zurl filtering ("normalization") into lib/zodb.py.
This also makes sense since this normalization can be regarded as a
general zodb tool which may be useful for other solutions which use
zodburi.

[1] nexedi/neoppod!18 (comment 184671)

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!17

ae54c563

lib/tests: Fix flaky zstor_2zurl test · cc33d610

Levin Zimmermann authored Jul 28, 2023

Kirill noted that nexedi/wendelin.core@fb620301 introduced a regression [1]:
'test_zstor_2zurl' sometimes passes and sometimes fails. The reason for
this is that there is no deterministic order of master nodes in
'NodeManager.getMasterList()', which is why there is no specified
order of master node addresses in a zurl [2]. We don't want to normalize
a zurl returned by 'zstor_2zurl' as we need some of the client-specific
parameters such as SSL file paths, so we rather fix the test to allow any
possible order of NEO master nodes in the zurl.
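
An order-insensitive check of this kind could look like the sketch below. The URI layout and the helper name `acceptable_zurls` are illustrative assumptions, not the actual test code.

```python
from itertools import permutations


def acceptable_zurls(masters, cluster):
    """Return every zurl that differs only in the order of master addresses."""
    return {
        "neo://%s/%s" % (",".join(order), cluster)
        for order in permutations(masters)
    }


ok = acceptable_zurls(["m1:1234", "m2:1234"], "cluster")
# A test would then assert that the produced zurl is a member of `ok`
# instead of comparing it against one fixed string.
```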

[1] nexedi/wendelin.core!17 (comment 188102)
[2] https://lab.nexedi.com/nexedi/wendelin.core/blob/fb620301/lib/zodb.py#L414

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!17

cc33d610

19 Jun, 2023 1 commit

lib/zodb/zstor_2zurl/NEO: support > 1 master nodes · fb620301

Levin Zimmermann authored Jun 14, 2023

The old code raised an explicit exception when converting a NEO storage
with > 1 master nodes into a URI. Perhaps the rationale for this exception
was that there isn't any agreed-upon order of master nodes in a NEO URI,
which means that building a URI from such a storage could potentially
break the invariant that any client which points to the same storage
should result in the same WCFS mountpoint.
With levin.zimmermann/wendelin.core@6f5196fa we can now rely on
WCFS mountpoint calculation to always return the same mountpoint even if
the order of master node addresses differs. Therefore we can drop this
exception and allow WCFS to support NEO clusters with more than one master.

--------

kirr: support for multiple masters was simply not implemented because in
a05db040 (lib/zodb: Teach zstor_2zurl about ZEO, NEO and Demo storages)
I thought that we did not yet actually need it and wanted to have
something minimal first.

I agree that in WCFS context it is ok and makes sense to normalize zurl
to have masters coming in particular order. But at zstor_2zurl level we
rely on the order of masters that app.nm.getMasterList gives us. The
normalization is a separate function.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!17

fb620301