1. 24 Jun, 2024 4 commits
    • Transform tree data to `bytes`. · fadc5a07
      Carlos Ramos Carreño authored
      The data in the tree test cannot be automatically converted to `bytes`,
      as an encoding is necessary in that case:
      
      ```python
      Traceback (most recent call last):
        File "/srv/slapgrid/slappart66/git/pygolang/golang/__init__.py", line 125, in _
          return f(*argv, **kw)
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 281, in TreesSrv
          zblk.setblkdata(v2)
        File "/srv/slapgrid/slappart66/git/wendelin.core/bigfile/file_zodb.py", line 294, in setblkdata
          blkdata = bytes(buf)                    # FIXME does memcpy
      TypeError: string argument without an encoding
      ```
      
      As we know that the data is ASCII, we use the ASCII encoder to convert
      it to `bytes`.
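A minimal Python 3 sketch of the failure and the fix described in this commit message. The sample value is illustrative only and is not taken from `treegen.py`.

```python
text = "abcdefghij"  # illustrative ASCII-only sample, not the real test data

# Python 3 refuses to build bytes from a str unless an encoding is given.
try:
    bytes(text)
    error = None
except TypeError as exc:
    error = str(exc)

# Because the data is known to be ASCII, an explicit ASCII encode is safe.
data = text.encode("ascii")

print(error)
print(data)
```

Encoding with `"ascii"` also fails loudly if non-ASCII data ever sneaks in, which is arguably preferable to guessing a wider encoding.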
    • Parse `bytes` as str in Golang. · 5a2b639c
      Carlos Ramos Carreño authored
      [In Go, strings can contain arbitrary bytes](https://go.dev/blog/strings)
      (similar to the ones in Python 2), and there is no explicit `bytes`
      class.
      Thus, all the code that deals with arbitrary data in Go is currently
      using the string type.
      
      However, the [ogórek package](https://github.com/kisielk/og-rek), used
      to load Python pickles in Go, decodes Python 3 `bytes` objects as a
      special string type `ogórek.Bytes`, which is incompatible with APIs expecting
      normal strings.
      This causes errors such as the following:
      ```
      --- FAIL: TestΔFtail (1.31s)
      panic: @03f97a3b982b7bcc: get blkdata from obj 0000000000000002: ZBlk0(0000000000000002): loadBlkData: wendelin.bigfile.file_zodb.ZBlk0(0000000000000002): activate: pysetstate: expect str; got ogórek.Bytes [recovered]
              panic: file:///tmp/TestΔFtail493253177/001/1.fs: @03f97a3b982b7bcc: get blktab: @03f97a3b982b7bcc: get blkdata from obj 0000000000000002: ZBlk0(0000000000000002): loadBlkData: wendelin.bigfile.file_zodb.ZBlk0(0000000000000002): activate: pysetstate: expect str; got ogórek.Bytes [recovered]
              panic: file:///tmp/TestΔFtail493253177/001/1.fs: @03f97a3b982b7bcc: get blktab: @03f97a3b982b7bcc: get blkdata from obj 0000000000000002: ZBlk0(0000000000000002): loadBlkData: wendelin.bigfile.file_zodb.ZBlk0(0000000000000002): activate: pysetstate: expect str; got ogórek.Bytes
      ```
      
      In order to fix that, we have copied the [`Xstrbytes` function from NEO](
      https://lab.nexedi.com/kirr/neo/-/blob/f7776fc1689b0d62e582b132ecc017ab72dc3b23/go/zodb/internal/pickletools/pickletools.go#L49-64)
      into the `pycompat.go` file; it accepts either a string or an
      `ogórek.Bytes` object and returns the corresponding string.
      
      We have used that function whenever a `bytes` object could be present.
    • Iterate the example data as a string. · eca099bc
      Carlos Ramos Carreño authored
      The example data ('abcdefghij') should be iterated as a string, and not
      as `bytes`.
      Iterating a `bytes` object in Python 3 yields `int` values, not `bytes`,
      giving the following error on decoding:
      
      ```
      --- FAIL: TestΔBTail (1.26s)
      panic: root['treegen/values']: key ['b']: expected str, got int64 [recovered]
              panic: file:///tmp/TestΔBTail2213686135/001/1.fs: @03f97a31f270e722: get blktab: root['treegen/values']: key ['b']: expected str, got int64 [recovered]
              panic: file:///tmp/TestΔBTail2213686135/001/1.fs: @03f97a31f270e722: get blktab: root['treegen/values']: key ['b']: expected str, got int64
      ```
      
      Instead, we can iterate as a string and encode each character to `bytes`
      afterwards.
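The difference can be seen in a short Python 3 sketch. The variable names are illustrative and do not come from `treegen.py`.

```python
data = "abcdefghij"  # the example data quoted in the commit message

# Iterating bytes in Python 3 yields integers (byte values), not 1-byte bytes objects.
as_ints = list(data.encode("ascii"))

# Iterating the str first and encoding each character keeps every item as bytes.
as_bytes = [ch.encode("ascii") for ch in data]

print(as_ints[:3])
print(as_bytes[:3])
```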
    • Execute `zip` eagerly. · b47294e1
      Carlos Ramos Carreño authored
      The builtin `zip` in Python 3 returns an iterator, not a list.
      Thus, one cannot directly call the `len` function on the object returned
      by `zip`, or we will get errors like the following one:
      
      ```python
      Traceback (most recent call last):
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 617, in <module>
          main()
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 613, in main
          cmd(argv)
        File "/srv/slapgrid/slappart66/venvs/wendelin.core/lib/python3.9/site-packages/decorator.py", line 232, in fun
          return caller(func, *(extras + args), **kw)
        File "/srv/slapgrid/slappart66/git/pygolang/golang/__init__.py", line 125, in _
          return f(*argv, **kw)
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 589, in cmd_trees
          TreesSrv(zstor, r)
        File "/srv/slapgrid/slappart66/venvs/wendelin.core/lib/python3.9/site-packages/decorator.py", line 232, in fun
          return caller(func, *(extras + args), **kw)
        File "/srv/slapgrid/slappart66/git/pygolang/golang/__init__.py", line 125, in _
          return f(*argv, **kw)
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 234, in TreesSrv
          treetxtPrev = zctx.ztreetxt(ztree)
        File "/srv/slapgrid/slappart66/venvs/wendelin.core/lib/python3.9/site-packages/decorator.py", line 232, in fun
          return caller(func, *(extras + args), **kw)
        File "/srv/slapgrid/slappart66/git/pygolang/golang/__init__.py", line 125, in _
          return f(*argv, **kw)
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 536, in ztreetxt
          return zctx.TopoEncode(xbtree.StructureOf(ztree))
        File "/srv/slapgrid/slappart66/venvs/wendelin.core/lib/python3.9/site-packages/decorator.py", line 232, in fun
          return caller(func, *(extras + args), **kw)
        File "/srv/slapgrid/slappart66/git/pygolang/golang/__init__.py", line 125, in _
          return f(*argv, **kw)
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree/xbtreetest/treegen.py", line 542, in TopoEncode
          return xbtree.TopoEncode(tree, zctx.vencode)
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree.py", line 797, in TopoEncode
          for nodev in _walkBFS(tree):
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree.py", line 701, in _walkBFS
          for level in __walkBFS(tree):
        File "/srv/slapgrid/slappart66/git/wendelin.core/wcfs/internal/xbtree.py", line 724, in __walkBFS
          assert len(rv) == len(rn.node.children)
      TypeError: object of type 'zip' has no len()
      ```
      
      Thus, we have to create a list from the result of `zip` before calling
      `len` on it.
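A small Python 3 sketch of the same situation, using made-up inputs rather than the real tree nodes from `xbtree.py`:

```python
keys = [1, 2, 3]              # illustrative inputs
children = ["a", "b", "c"]

# zip() returns a lazy iterator in Python 3, so len() is not defined for it.
lazy = zip(keys, children)
try:
    len(lazy)
    error = None
except TypeError as exc:
    error = str(exc)

# Materializing the iterator into a list makes len() available.
pairs = list(zip(keys, children))

print(error)
print(len(pairs))
```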
  2. 21 Jun, 2024 1 commit
    • *: Do not use relative imports · 3846997b
      Kirill Smelkov authored
      Because of the way wendelin.core organizes its in-tree Python import
      redirector (see wendelin.py), it is possible to import the same module
      twice, with Python thinking it is importing two different modules. For
      example, when installed in develop mode, Python resolves the following
      imports to the same bigfile/__init__.py
      
          import wendelin.bigfile
          import bigfile
      
      but tries to load that module twice and independently. This leads to
      the virtmem DSO, linked to from under the bigfile/_bigfile extension, being
      initialized twice and complaining about it, because only a single GIL
      hook should be requested to be installed:
      
          (py39.venv) kirr@deca:~/src/wendelin/wendelin.core$ python
          Python 3.9.19+ (heads/3.9:40d77b93672, Apr 12 2024, 06:40:05)
          [GCC 12.2.0] on linux
          Type "help", "copyright", "credits" or "license" for more information.
          >>> import wendelin.bigfile
          >>> import bigfile
          python: bigfile/virtmem.c:106: virt_lock_hookgil: Assertion `!(virtmem_gilhooks)' failed.
          Aborted
      
      This problem was there from day 1, but it did not cause issues in
      practice because wendelin.core users do `import wendelin...`, and there
      was also no problem with running pytest in the source tree.
      
      However, with py39 and pytest8 we see that pytest somehow started
      to unconditionally import things from under both namespaces, which makes
      it impossible to run the tests even when instructing pytest to collect them
      via the python-modules namespace instead of the filesystem:
      
          (py39.venv) kirr@deca:~/src/wendelin/wendelin.core$ pytest -vsx --pyargs wendelin.bigfile.tests.test_basic
          ======================== test session starts ========================
          platform linux -- Python 3.9.19+, pytest-8.2.2, pluggy-1.5.0 -- /home/kirr/src/wendelin/venv/py39.venv/bin/python3.9
          cachedir: .pytest_cache
          rootdir: /home/kirr/src/wendelin/wendelin.core
          configfile: pyproject.toml
          collecting ... python3.9: bigfile/virtmem.c:106: virt_lock_hookgil: Assertion `!(virtmem_gilhooks)' failed.
          Fatal Python error: Aborted
      
          Current thread 0x00007fa172a60740 (most recent call first):
            File "<frozen importlib._bootstrap>", line 228 in _call_with_frames_removed
            File "<frozen importlib._bootstrap_external>", line 1173 in create_module
            File "<frozen importlib._bootstrap>", line 565 in module_from_spec
            File "<frozen importlib._bootstrap>", line 666 in _load_unlocked
            File "<frozen importlib._bootstrap>", line 986 in _find_and_load_unlocked
            File "<frozen importlib._bootstrap>", line 1007 in _find_and_load
            File "/home/kirr/src/wendelin/wendelin.core/bigfile/__init__.py", line 31 in <module>
            File "<frozen importlib._bootstrap>", line 228 in _call_with_frames_removed
            File "<frozen importlib._bootstrap_external>", line 850 in exec_module
            File "<frozen importlib._bootstrap>", line 680 in _load_unlocked
            File "<frozen importlib._bootstrap>", line 986 in _find_and_load_unlocked
            File "<frozen importlib._bootstrap>", line 1007 in _find_and_load
            File "<frozen importlib._bootstrap>", line 1030 in _gcd_import
            File "<frozen importlib._bootstrap>", line 228 in _call_with_frames_removed
            File "<frozen importlib._bootstrap>", line 972 in _find_and_load_unlocked
            File "<frozen importlib._bootstrap>", line 1007 in _find_and_load
            File "<frozen importlib._bootstrap>", line 1030 in _gcd_import
            File "<frozen importlib._bootstrap>", line 228 in _call_with_frames_removed
            File "<frozen importlib._bootstrap>", line 972 in _find_and_load_unlocked
            File "<frozen importlib._bootstrap>", line 1007 in _find_and_load
            File "<frozen importlib._bootstrap>", line 1030 in _gcd_import
            File "/home/kirr/local/py3.9/lib/python3.9/importlib/__init__.py", line 127 in import_module
            File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/pathlib.py", line 591 in import_path
            File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/python.py", line 492 in importtestmodule
            ...
            File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/runner.py", line 567 in collect_one_node
            File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/main.py", line 837 in _collect_one_node
            File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/main.py", line 974 in genitems
            File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/main.py", line 811 in perform_collect
            File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/main.py", line 349 in pytest_collection
            ...
            File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/config/__init__.py", line 178 in main
            File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/_pytest/config/__init__.py", line 206 in console_main
            File "/home/kirr/src/wendelin/venv/py39.venv/bin/pytest", line 8 in <module>
          Aborted (core dumped)
      
      This happens because wendelin.bigfile is importing
      wendelin.bigfile._bigfile as `from ._bigfile import ...` which under
      pytest leads to importing both wendelin.bigfile._bigfile and
      bigfile._bigfile and further conflicting when setting up GIL hooks.
      
      -> Fix this issue by avoiding relative imports and always referring to
         wendelin.core modules with `wendelin.` prefix.
      
      The list of places where relative imports were used was small and found via
      
          $ git grep -w import |grep '\s\.'
          bigfile/__init__.py:from ._bigfile import BigFile, WRITEOUT_STORE, WRITEOUT_MARKSTORED, ram_reclaim
          wcfs/__init__.py:from .client._wcfs import \
      
      Everywhere else we were already importing things from under wendelin
      namespace via fully specified module path.
      
      After the fix both
      
          $ pytest -vsx --pyargs wendelin.bigfile.tests.test_basic
      
      and
      
          $ pytest -vsx bigfile/tests/test_basic.py
      
      start to work ok from inside the worktree.
      
      /reported-and-tested-by @vnmabus
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!26
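The double-loading mechanism can be reproduced in isolation. The sketch below does not use wendelin.core's redirector: it assumes a throwaway package (`demo_pkg` and `demo_mod` are hypothetical names) that is reachable under two names via `sys.path`, which is enough to make Python treat one file as two modules.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: <root>/demo_pkg/{__init__.py, demo_mod.py}.
root = tempfile.mkdtemp()
pkgdir = os.path.join(root, "demo_pkg")
os.mkdir(pkgdir)
for name in ("__init__.py", "demo_mod.py"):
    with open(os.path.join(pkgdir, name), "w") as f:
        f.write("")

# Make the module reachable both as demo_pkg.demo_mod and as plain demo_mod.
sys.path[:0] = [root, pkgdir]
importlib.invalidate_caches()

qualified = importlib.import_module("demo_pkg.demo_mod")
bare = importlib.import_module("demo_mod")

# Same file on disk, yet two independent module objects in sys.modules.
same_file = os.path.samefile(qualified.__file__, bare.__file__)
same_module = qualified is bare
print(same_file, same_module)
```

For a pure-Python module this only duplicates state; for an extension that installs a process-wide hook on load, as described above for virtmem, the second load is where the assertion fires.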
  3. 07 Jun, 2024 1 commit
    • setup: Allow editable wheels. · 37cf1383
      Carlos Ramos Carreño authored
      When building an editable wheel, it is not guaranteed that
      `build_packages` (or even `run`) is called before `get_outputs`
      (note the following in
      https://setuptools.pypa.io/en/latest/userguide/extension.html#supporting-sdists-and-editable-installs-in-build-sub-commands :
      "Please note that custom sub-commands SHOULD NOT rely on `run()` being
      executed (or not) to provide correct return values for `get_outputs()`,
      `get_output_mapping()` or `get_source_files()`. The `get_*` methods
      should work independently of `run()`.").
      
      Our implementation relied on the call to `build_packages` to set the
      name of the synthetic init file.
      This commit uses a property of the object instead, so that the name is
      computed whenever it is needed.
      With this change, it is now possible to build editable wheels.
      
      --------
      kirr:
      
      Without the fix `pip install -e` fails as follows on py3:
      
        Traceback (most recent call last):
          File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 155, in run
            self._create_wheel_file(bdist_wheel)
          File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 357, in _create_wheel_file
            files, mapping = self._run_build_commands(dist_name, unpacked, lib, tmp)
          File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 281, in _run_build_commands
            files, mapping = self._collect_build_outputs()
          File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/editable_wheel.py", line 266, in _collect_build_outputs
            files.extend(cmd.get_outputs() or [])
          File "<string>", line 137, in get_outputs
          File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/command/build_py.py", line 78, in __getattr__
            return orig.build_py.__getattr__(self, attr)
          File "/home/kirr/src/wendelin/venv/py39.venv/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 107, in __getattr__
            raise AttributeError(attr)
        AttributeError: initfile
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!25
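The pattern behind the fix can be sketched without setuptools. Both classes below are hypothetical stand-ins, not the code from `setup.py`; the paths and the `build_lib` attribute are assumptions made for illustration.

```python
class BuildBefore:
    """Hypothetical command: the name is set only as a side effect of build_packages()."""

    def build_packages(self):
        self.initfile = "build/lib/pkg/__init__.py"

    def get_outputs(self):
        # Fails if build_packages() has not run yet.
        return [self.initfile]


class BuildAfter:
    """Hypothetical command: the name is computed on demand via a property."""

    build_lib = "build/lib"

    @property
    def initfile(self):
        return self.build_lib + "/pkg/__init__.py"

    def get_outputs(self):
        # Works regardless of whether any build step has run.
        return [self.initfile]


try:
    BuildBefore().get_outputs()
    error = None
except AttributeError as exc:
    error = exc

outputs = BuildAfter().get_outputs()
print(repr(error), outputs)
```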
  4. 31 May, 2024 2 commits
  5. 03 Apr, 2024 2 commits
    • bigfile/zodb: Make auto format the default · 99f262dd
      Levin Zimmermann authored
      If a user doesn't explicitly declare a ZBlk format, it can be assumed
      that this user wants the best trade-off between consumed storage space
      and data access speed. Currently the best trade-off between these two is
      provided by the new 'auto' (heuristic) format. In case of small appends this
      format helps reduce storage space, and in any other case it just
      behaves like ZBlk0 [1]. Therefore this default ensures fast access [2],
      but also avoids massive data growth in case of many small appends [3].
      
      [1] An exception to this is: in its current implementation a block
      behaves like ZBlk1 (slow access) in case it isn't fully filled up yet.
      
      [2] As this was stated as a reason why ZBlk1 as a default format was
      reverted in nexedi/wendelin.core@0b68f178.
      
      [3] This was perhaps the reason why ZBlk1 was set to be the default format
      in nexedi/wendelin.core@9ae42085. The massive
      storage space consumption can already be a problem with just a few arrays to
      which small data is regularly appended, as can easily happen with
      Wendelin development instances.
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!20
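The behaviour summarized in this message and its footnote [1] can be reduced to a per-block rule. The helper below is a hypothetical, deliberately simplified sketch: the function name and its parameters are assumptions, and the actual heuristic in wendelin.core may take more into account than the fill level of a block.

```python
def pick_zblk_format(filled_bytes, blksize):
    """Hypothetical helper: choose a storage format for one block.

    A block that is not yet full is handled like ZBlk1 (compact, slower to read);
    a full block is handled like ZBlk0 (larger, faster to read).
    """
    if filled_bytes < blksize:
        return "ZBlk1"
    return "ZBlk0"


blksize = 2 * 1024 * 1024  # illustrative block size, not a documented constant
print(pick_zblk_format(4096, blksize))
print(pick_zblk_format(blksize, blksize))
```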
    • bigfile/zodb: Add ZBlk format option 'auto' (heuristic) · d6628427
      Levin Zimmermann authored
      There are two formats to save data with a ZBigFile: ZBlk0 and ZBlk1.
      They trade access time against disk-space growth differently:
      ZBlk1 is better with regard to disk space, while ZBlk0
      has a better access time. Wendelin.core users may not always know or
      care which format fits their data better. In this case it may be
      easier for users to just let the program select the ZBlk
      format automatically. With this patch and the new 'auto' (for heuristic) option of the
      'ZBlk' argument of ZBigFile, this is now possible. The 'auto' option isn't
      really a new ZBlk format in itself; it just tries to automatically
      select the best ZBlk format according to the characteristics
      of the changes that the user applies to the ZBigFile.
      
      In its current implementation, the heuristic tackles the use case of
      large arrays with many small append-only changes. In this case 'auto' is
      smaller in space than ZBlk0, but faster to read than ZBlk1. It does so
      by initially using ZBlk1 until a blk is filled up. Once a blk is full,
      it switches to ZBlk0, as recommended by @kirr in
      nexedi/wendelin.core!20 (comment 196084).
      
      With this patch comes a test (bigfile/tests/bench_zblkfmt) that creates
      benchmarks for different combinations of changes and zblk formats. The test aims
      to check how the 'auto' (heuristic) format performs in contrast to 'ZBlk0'
      and 'ZBlk1':
      
          BenchmarkAppendSize/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]   1       538.1 MB
          BenchmarkAppendRandRead/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]       6 2.085 ms/blk
          BenchmarkAppendSize/zblk=ZBlk1/change_count=500/change_percentage_set=[0.014]   1       16.8 MB
          BenchmarkAppendRandRead/zblk=ZBlk1/change_count=500/change_percentage_set=[0.014]       6 14.564 ms/blk
          BenchmarkAppendSize/zblk=auto/change_count=500/change_percentage_set=[0.014]    1       29.4 MB
          BenchmarkAppendRandRead/zblk=auto/change_count=500/change_percentage_set=[0.014]        6 2.119 ms/blk
          BenchmarkRandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2]  1       1021.1 MB
          BenchmarkRandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      3 2.324 ms/blk
          BenchmarkRandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2]  1       216.2 MB
          BenchmarkRandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      3 15.317 ms/blk
          BenchmarkRandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2]   1       219.8 MB
          BenchmarkRandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2]       3 14.027 ms/blk
          BenchmarkRandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[1]    1       1048.6 MB
          BenchmarkRandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[1]        3 2.126 ms/blk
          BenchmarkRandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[1]    1       1070.4 MB
          BenchmarkRandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[1]        3 14.284 ms/blk
          BenchmarkRandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[1]     1       1070.3 MB
          BenchmarkRandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[1] 3 14.072 ms/blk
          BenchmarkRandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]        1       1046.4 MB
          BenchmarkRandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    3 2.137 ms/blk
          BenchmarkRandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]        1       638.2 MB
          BenchmarkRandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    3 14.083 ms/blk
          BenchmarkRandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1] 1       639.5 MB
          BenchmarkRandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]     3 13.937 ms/blk
      
      and post-processed with benchstat from 3 such runs:
      
                                                                                                  │     x.log     │
                                                                                                  │       B       │
          AppendSize/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]                       513.2Mi ± 0%
          AppendSize/zblk=ZBlk1/change_count=500/change_percentage_set=[0.014]                       16.02Mi ± 0%
          AppendSize/zblk=auto/change_count=500/change_percentage_set=[0.014]                        28.04Mi ± 0%
          RandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      973.8Mi ± 0%
          RandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      206.2Mi ± 0%
          RandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2]       209.6Mi ± 0%
          RandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[1]       1000.0Mi ± 0%
          RandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[1]       1020.8Mi ± 0%
          RandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[1]        1020.7Mi ± 0%
          RandWriteSize/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    997.9Mi ± 0%
          RandWriteSize/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    608.6Mi ± 0%
          RandWriteSize/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]     609.9Mi ± 0%
          geomean                                                                                    353.0Mi
      
                                                                                                      │    x.log    │
                                                                                                      │   ms/blk    │
          AppendRandRead/zblk=ZBlk0/change_count=500/change_percentage_set=[0.014]                      2.094 ± 12%
          AppendRandRead/zblk=ZBlk1/change_count=500/change_percentage_set=[0.014]                      14.47 ±  1%
          AppendRandRead/zblk=auto/change_count=500/change_percentage_set=[0.014]                       2.168 ±  2%
          RandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2]     2.324 ±  1%
          RandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2]     13.73 ± 12%
          RandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2]      13.60 ±  3%
          RandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[1]       2.125 ±  2%
          RandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[1]       14.18 ±  3%
          RandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[1]        14.17 ±  1%
          RandWriteRandRead/zblk=ZBlk0/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]   2.118 ±  1%
          RandWriteRandRead/zblk=ZBlk1/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]   13.85 ±  2%
          RandWriteRandRead/zblk=auto/arrsize=1000000/change_count=500/change_percentage_set=[0.2,1]    13.80 ±  1%
          geomean                                                                                       6.423
      
      See nexedi/wendelin.core!20 and
      da765ef7...0c6f0850 for the
      preliminary history of this patch.
      Co-authored-by: Kirill Smelkov <kirr@nexedi.com>
      
      Fix typo.
  6. 29 Mar, 2024 1 commit
    • lib/mem += memdelta · 84def52e
      Kirill Smelkov authored
      This is a utility function that we will need in the next patch to
      see how similar the data of two blocks are to each other.
      
      We use numpy for the implementation because this code will be hot and if we
      don't use optimized C routines writeout will become very slow.
      
      Quoting draft patch kirr/wendelin.core@3f631932 :
      
          -> Also optimize ndelta computation - when done in plain python just
             this part was taking a lot of time as timing for initial writeup
             showed:
      
               writeup with ZBlk0: ~20-25s
               writeup with ZBlk1: ~20-30s
               writeup with auto:  was ~ 120s
      
             now, after switching to numpy for ndelta computation, whole runtime
             with 'auto' is taking ~ 35s. The whole runtime, if I observe
             benchmark execution correctly, is dominated by database writeup.
      
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!20
  7. 11 Dec, 2023 2 commits
    • fixup! wcfs: v↑ go dependencies · da765ef7
      Kirill Smelkov authored
      3636242f does not talk about go-fuse, which is wcfs's direct dependency
      and was actually updated by upstream:
      
      kirr/go-fuse@9f9ad4a1
      
      -> Update it as well.
    • wcfs: v↑ go dependencies · 3636242f
      Levin Zimmermann authored
      Update all dependencies of WCFS to their recent versions:
      
      - neo/go: Update to pick up support for NEO/go to handle multiple master nodes
      - go123: Add support for Go1.21
      
      The following dependencies were updated upstream, but depend on higher Go
      versions than wendelin.core supports:
      
      - glog: v1.1.0 needs go 1.19, but we still support go 1.18
      
      The following dependencies were not updated by upstream at all:
      
      - overflow
      - og-rek
      - errors
      - testify
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!22
  8. 01 Aug, 2023 1 commit
  9. 30 Jul, 2023 5 commits
    • lib/zodb: Insure NEO with > 1 master always normalizes to same URI · c04e95f9
      Levin Zimmermann authored
      If a NEO cluster has multiple master nodes, there is no agreed-upon
      order in which the master node addresses appear in the URI.
      In order to ensure we always get the same normalized URI among different
      clients of a NEO cluster with more than one master node, we explicitly
      sort the master node addresses with this patch.
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!17
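The idea of sorting master addresses can be sketched with the standard library alone. The helper name `sort_masters` and the URI layout used here (`neo://master1,master2/cluster`) are assumptions for illustration; the real normalization lives in lib/zodb.py and may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def sort_masters(zurl):
    """Hypothetical helper: return zurl with its comma-separated masters sorted."""
    parts = urlsplit(zurl)
    masters = sorted(parts.netloc.split(","))
    return urlunsplit(parts._replace(netloc=",".join(masters)))


# Two clients listing the same masters in a different order...
a = sort_masters("neo://m2:2051,m1:2051/cluster")
b = sort_masters("neo://m1:2051,m2:2051/cluster")

# ...end up with the same normalized URI.
print(a)
print(a == b)
```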
    • lib/zodb/zurl_normalize_main += explicit filtering depending on scheme · f5275f82
      Levin Zimmermann authored
      In the old source code we already filtered NEO URIs by dropping
      credentials, but we applied this filter to any URI, not only NEO
      ones. This patch adds a mechanism to apply different filters according to
      the specific storage type. Starting with this patch,
      'zurl_normalize_main' also refuses to normalize a URI with an unknown
      scheme.
      
      /reviewed-by @kirr
      /reviewed-on !17
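A per-scheme filter table with refusal of unknown schemes could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names, the set of schemes, and in particular the query parameters treated as client-only credentials. It is not the implementation of `zurl_normalize_main`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names of client-only (credential-like) query parameters.
CLIENT_ONLY_PARAMS = {"ca", "cert", "key"}


def _filter_neo(parts):
    """Drop client-only parameters from a NEO URI (illustrative rule)."""
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in CLIENT_ONLY_PARAMS]
    return parts._replace(query=urlencode(kept))


def _keep(parts):
    """Leave the URI unchanged."""
    return parts


# One filter per known storage scheme; anything else is refused.
FILTERS = {"neo": _filter_neo, "zeo": _keep, "file": _keep}


def normalize(zurl):
    parts = urlsplit(zurl)
    if parts.scheme not in FILTERS:
        raise ValueError("cannot normalize URI with unknown scheme %r" % parts.scheme)
    return urlunsplit(FILTERS[parts.scheme](parts))


print(normalize("neo://m1:2051/cluster?ca=/a.pem&compress=1"))
```

Dispatching on the scheme keeps the NEO-specific rule from leaking into other storage types, which is the problem the commit message describes.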
    • qa: lib/tests/zodb += zurl_normalize_main · 6032b274
      Levin Zimmermann authored
      After moving zurl filtering to a dedicated function, we can
      now test this function for correctness. It's important that different
      clients which point to the same storage always result in the same
      zodburi, even if their initial user-specified zodburi slightly differs
      (e.g. due to different client-side parameters or different paths of encryption).
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!17
    • wcfs: Move zuri filter to lib/zodb · ae54c563
      Levin Zimmermann authored
      The WCFS mountpoint of any ZODB storage must be a unique, persistent,
      repeatable hash. This means any client which uses the same storage must
      always calculate the same WCFS mountpoint (independent from
      client-only parameters etc.). Therefore the WCFS mountpoint calculation
      must be robust for all supported ZODB storage types (at least NEO, ZEO,
      filestorage).
      
      It was recently decided [1] that in order to provide this robustness, WCFS
      mountpoint calculation should filter the parsed URI in order to drop
      parts that prevent repeatability/persistence across different
      clients (e.g. parts which can differ between clients although the same
      storage is accessed). In order to make this filtering implementation a
      bit easier to read and wcfs/__init__.py less dense, the first step
      is to move the zurl filtering ("normalization") into lib/zodb.py.
      This also makes sense since this normalization can be regarded as a
      general ZODB tool which may be useful for other solutions which use
      zodburi.
      
      [1] nexedi/neoppod!18 (comment 184671)
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!17
    • lib/tests: Fix flaky zstor_2zurl test · cc33d610
      Levin Zimmermann authored
      Kirill noted that fb620301 introduced a regression [1]:
      'test_zstor_2zurl' sometimes passes and sometimes fails. The reason for
      this is that there is no deterministic order of master nodes in
      'NodeManager.getMasterList()', which is why there is no specified
      order of master node addresses in a zurl [2]. We don't want to normalize
      a zurl returned by 'zstor_2zurl' because we need some of the client-specific
      parameters, such as SSL file paths, so we instead fix the test to allow any
      possible order of NEO master nodes in the zurl.
      
      [1] !17 (comment 188102)
      [2] https://lab.nexedi.com/nexedi/wendelin.core/blob/fb620301/lib/zodb.py#L414
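
A test that accepts any master order can be sketched as follows. This is only an illustration: the helper name `acceptable_zurls` is hypothetical, and the comma-separated master list in the URL is an assumption of the sketch rather than something stated in this commit.

```python
from itertools import permutations

def acceptable_zurls(cluster, masters):
    """Return the set of zurls that differ only in the order of masters.

    Assumption: masters are joined with ',' after 'cluster@'.
    """
    return {"neo://%s@%s" % (cluster, ",".join(p))
            for p in permutations(masters)}

def assert_zurl_ok(zurl, cluster, masters):
    # The test passes for whichever order the node manager happened to return.
    assert zurl in acceptable_zurls(cluster, masters), zurl
```

The check is order-insensitive without normalizing the zurl itself, so client-specific parameters in the real URL would remain untouched.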
      
      /reviewed-by @kirr
      /reviewed-on !17
      cc33d610
  10. 19 Jun, 2023 1 commit
    • Levin Zimmermann's avatar
      lib/zodb/zstor_2zurl/NEO: support > 1 master nodes · fb620301
      Levin Zimmermann authored
      The old code raised an explicit exception when converting a NEO storage
      with > 1 master nodes into a URI. Perhaps the rationale for this exception
      was that there isn't any agreed-upon order of master nodes in a NEO URI,
      which means that building a URI from such a storage could potentially
      break the invariant that any client which points to the same storage
      should result in the same WCFS mountpoint.
      With levin.zimmermann/wendelin.core@6f5196fa we can now rely on
      WCFS mountpoint calculation to always return the same mountpoint even if
      the order of master node addresses differs. Therefore we can drop this
      exception and allow WCFS to support NEO clusters with more than one master.
      
      --------
      
      kirr: support for multiple masters was simply not implemented because in
      a05db040 (lib/zodb: Teach zstor_2zurl about ZEO, NEO and Demo storages)
      I thought that we did not yet actually need it and wanted to have
      something minimal first.
      
      I agree that in the WCFS context it is ok and makes sense to normalize zurl
      to have masters coming in a particular order. But at the zstor_2zurl level we
      rely on the order of masters that app.nm.getMasterList gives us. The
      normalization is a separate function.
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!17
      fb620301
  11. 21 Dec, 2022 3 commits
  12. 26 Nov, 2022 3 commits
    • Levin Zimmermann's avatar
      zstor_2zurl: Fix ipv6 host for NEO/ZEO + test fix · 20498b2f
      Levin Zimmermann authored
      This patch allows using WCFS with a NEO or ZEO storage which is
      reachable by a URL which contains an ipv6 host.
      
      Without this patch the following example doesn't work:
      
      >>> from wendelin.lib.zodb import dbopen
      >>> root = dbopen("neo://cluster-name@[::1]:2051")
      >>> # "abc" points to a ZBigArray
      >>> root["abc"][0]
      
      It doesn't work because the parser did not add square brackets around
      ipv6 hosts, so unparsing the resulting URL produced an address in which
      the start of the port was misinterpreted.
      
      This patch furthermore amends 'test_zstor_2zurl' to test ZEO and NEO
      storages with ipv6 hosts.
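
The bracket rule can be sketched as below. The helper names `join_hostport` and `split_hostport` are hypothetical and only illustrate the idea; they are not the functions changed by this patch.

```python
from urllib.parse import urlsplit

def join_hostport(host, port):
    """Build 'host:port', wrapping ipv6 literals in square brackets."""
    if ":" in host and not host.startswith("["):
        host = "[%s]" % host      # ipv6 literal: ':' would otherwise clash with the port separator
    return "%s:%d" % (host, port)

def split_hostport(url):
    """Parse host and port back out of a URL; brackets are stripped by urlsplit."""
    u = urlsplit(url)
    return u.hostname, u.port
```

For example, `join_hostport("::1", 2051)` gives `[::1]:2051`, and parsing `neo://cluster-name@[::1]:2051` yields the host `::1` and the port `2051` again.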
      
      ---
      
      /reviewed-by @kirr
      /reviewed-on !13
      20498b2f
    • Levin Zimmermann's avatar
      lib/zodb/zstor_2zurl: Add comprehensive tests · 0a09d51e
      Levin Zimmermann authored
      This patch adds comprehensive tests for 'wendelin.lib.zodb.zstor_2zurl'.
      Before this patch only one related test existed ('test_zurlstable').
      This test only lightly checked correct functionality of 'zstor_2zurl'.
      Therefore we added the new tests 'test_zstor_2zurl' and
      'test_zurlsamedb'.
      
      The new tests only cover existing functionality.
      
      ---
      
      Co-authored-by: kirr
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!13
      0a09d51e
    • Levin Zimmermann's avatar
      test_zodb/zsync: Fix ZEO storage synchronization · 28a7db7f
      Levin Zimmermann authored
      Before this patch 'zsync(storage)' had no effect for ZEO storages: it
      didn't synchronize the client with the server. This patch fixes 'zsync'
      so that it also synchronizes ZEO clients.
      
      Background information:
      =======================
      
      In 2006 the sync mode of ZEO was removed:
      
        ZEO@629b0667
      
      and only async mode has been supported since then. This means that the "sync"
      method of ZEO.ClientStorage in fact had no effect. In ZEO 5 the
      "server-sync" option was added:
      
        https://github.com/zopefoundation/ZEO/pull/63
      
      Setting this option to 'True' makes the 'sync' method perform a
      "server round trip, thus causing client to wait for outstanding
      invalidations" [1]. In this patch we imitate the effect of this flag
      for both ZEO 4 and ZEO 5.
      
      [1] https://github.com/zopefoundation/ZEO/blob/423cb8563be3e1ee0bb4297ee980d9b74f09c710/src/ZEO/ClientStorage.py#L225-L226
      
      ---
      
      /reviewed-by @kirr
      /reviewed-on !13
      28a7db7f
  13. 10 Nov, 2022 1 commit
    • Levin Zimmermann's avatar
      BigArray: Fix API deviation with ndarray (shape) · adffe247
      Levin Zimmermann authored
      The 'shape' argument of 'numpy.ndarray's initialization method accepts
      an integer or a sequence of integers. But the 'shape' property of
      'numpy.ndarray' always returns tuple[int, ...], so numpy internally
      casts any legal argument into tuple[int, ...].
      
      In 'BigArray' and 'ZBigArray' this internal casting didn't exist yet.
      This patch adds the casting.
      
      Before:
      
        ZBigArray(shape=[1, 2, 3], dtype=float).shape == [1, 2, 3]
      
      After:
      
        ZBigArray(shape=[1, 2, 3], dtype=float).shape == (1, 2, 3)
      
      In this way the BigArray and ZBigArray API behaves closer to numpy.ndarray,
      which should help avoid confusion when people are using BigArray /
      ZBigArray.
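
The kind of casting involved can be sketched in plain Python. The function name `normalize_shape` is hypothetical; this is neither numpy's internal code nor necessarily what the patch does, only an illustration of turning any legal 'shape' argument into a tuple of ints.

```python
import operator

def normalize_shape(shape):
    """Return shape as tuple[int, ...] for an int or a sequence of ints."""
    try:
        # A single integer becomes a 1-tuple.
        return (operator.index(shape),)
    except TypeError:
        # Otherwise treat it as a sequence of integers.
        return tuple(operator.index(n) for n in shape)
```

With this, both `normalize_shape([1, 2, 3])` and `normalize_shape((1, 2, 3))` give `(1, 2, 3)`, and `normalize_shape(4)` gives `(4,)`.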
      
      -----
      
      See issue nexedi/wendelin.core#9 and
      MR nexedi/wendelin.core!14
      for additional context.
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!14
      adffe247
  14. 18 May, 2022 1 commit
    • Kirill Smelkov's avatar
      demo_zbigarray: Fix it for Python3 · 61dc1ff2
      Kirill Smelkov authored
      Wendelin.core already supports Python3 relatively well, but demo_zbigarray.py,
      which is invoked only manually, was missing compatibility bits for xrange:
      
          (neo) (py3.venv) (g.env) kirr@deca:~/src/neo/src/lab.nexedi.com/nexedi/wendelin.core$ ./demo/demo_zbigarray.py gen 1.fs
          I: RAM:  15.29GB
          I: WORK: 30.57GB
          gen signal t=0...4.10e+09  float64  (= 30.57GB)
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/wendelin.core/./demo/demo_zbigarray.py", line 154, in <module>
              main()
            File "/home/kirr/src/wendelin/venv/py3.venv/lib/python3.9/site-packages/decorator.py", line 232, in fun
              return caller(func, *(extras + args), **kw)
            File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _
              return f(*argv, **kw)
            File "/home/kirr/src/wendelin/wendelin.core/./demo/demo_zbigarray.py", line 142, in main
              gen(sig)
            File "/home/kirr/src/wendelin/wendelin.core/./demo/demo_zbigarray.py", line 74, in gen
              for t0 in xrange(0, len(a), blocksize):
          NameError: name 'xrange' is not defined
      
      -> Fix it.
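
One common way to provide such a compatibility bit is sketched below. The commit message does not say which approach the patch takes, so this is an assumption; the helper `block_starts` is hypothetical and only mirrors the shape of the failing loop.

```python
try:
    xrange              # exists on Python 2
except NameError:
    xrange = range      # Python 3: range is already lazy

def block_starts(n, blocksize):
    """Offsets at which each block begins, as in `for t0 in xrange(0, n, blocksize)`."""
    return list(xrange(0, n, blocksize))
```

After the fallback is in place the same loop runs unchanged on both Python versions.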
      61dc1ff2
  15. 02 Feb, 2022 1 commit
    • Kirill Smelkov's avatar
      setup: Fix sdist/egg_info/... on Python3 · 3d0f134c
      Kirill Smelkov authored
      Arnaud reports that wendelin.core currently cannot be installed on
      Python3:
      
          /opt/slapgrid/3f9add9291086dee302fc478df4b3130/parts/python3/bin/python3 /tmp/tmp1fuxchsb -q develop -mN -d /opt/slapgrid/3f9add9291086dee302fc478df4b3130/develop-eggs/tmps5jr7ymsbuild
          /opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/dist.py:476: UserWarning: Normalizing '2.0.alpha2.post1' to '2.0a2.post1'
          package init file '__init__.py' not found (or not a regular file)
          Traceback (most recent call last):
           File "/tmp/tmp1fuxchsb", line 19, in <module>
             exec(compile(f.read(), '/opt/slapgrid/3f9add9291086dee302fc478df4b3130/parts/wendelin.core/setup.py', 'exec'))
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/parts/wendelin.core/setup.py", line 426, in <module>
             """.splitlines()]
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/develop-eggs/pygolang-0.1-py3.7-linux-x86_64.egg/golang/pyx/build.py", line 118, in setup
             setuptools_dso.setup(**kw)
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools_dso-1.7-py3.7.egg/setuptools_dso/__init__.py", line 37, in setup
             _setup(**kws)
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/__init__.py", line 162, in setup
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/parts/python3/lib/python3.7/distutils/core.py", line 148, in setup
             dist.run_commands()
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/parts/python3/lib/python3.7/distutils/dist.py", line 966, in run_commands
             self.run_command(cmd)
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/parts/python3/lib/python3.7/distutils/dist.py", line 985, in run_command
             cmd_obj.run()
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/command/develop.py", line 38, in run
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/command/develop.py", line 136, in install_for_development
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/parts/python3/lib/python3.7/distutils/cmd.py", line 313, in run_command
             self.distribution.run_command(command)
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/parts/python3/lib/python3.7/distutils/dist.py", line 985, in run_command
             cmd_obj.run()
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/command/egg_info.py", line 296, in run
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/command/egg_info.py", line 303, in find_sources
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/command/egg_info.py", line 537, in run
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/command/egg_info.py", line 591, in prune_file_list
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/command/egg_info.py", line 452, in prune
           File "/opt/slapgrid/3f9add9291086dee302fc478df4b3130/eggs/setuptools-44.1.1-py3.7.egg/setuptools/command/egg_info.py", line 405, in _remove_files
          TypeError: cannot use a string pattern on a bytes-like object
          While:
           Installing wendelin.core.
      
      The problem turned out to be that the `git ls-files` output, which we add to the
      list of source files, is bytes, and things break when those bytes get
      intermixed with strings.
      
      -> Fix it by always returning from runcmd the str type of current python.
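
A helper that always returns the native str type can be sketched as follows. The name `runcmd_sketch` and the choice of UTF-8 for decoding are assumptions of this example, not details taken from setup.py.

```python
import subprocess

def runcmd_sketch(argv):
    """Run a command and return its stdout as the native str type."""
    out = subprocess.check_output(argv)
    if not isinstance(out, str):
        # Python 3: check_output returns bytes; decode so that the result
        # can be mixed with str patterns and paths (UTF-8 assumed here).
        out = out.decode("utf-8")
    return out
```

On Python 2 the output is already str and is returned as is, so callers see the same type on both versions.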
      
      /reported-by @arnau
      3d0f134c
  16. 27 Jan, 2022 3 commits
    • Kirill Smelkov's avatar
      Fix build_dso on clean checkout · ad6305c0
      Kirill Smelkov authored
      Similarly to build_ext we need ccan/config.h to be present for dso to
      build. It was not the case and so pip install wendelin.core was failing:
      
          $ pip install wendelin.core-2.0a2.tar.gz
          Processing ./wendelin.core-2.0a2.tar.gz
            Installing build dependencies ... done
            Getting requirements to build wheel ... done
              Preparing wheel metadata ... done
          Collecting ZODB>=4
          ...
          Building wheels for collected packages: wendelin.core
            Building wheel for wendelin.core (PEP 517) ... error
            ERROR: Command errored out with exit status 1:
            ...
            running build_dso
            Building DSOs
            building 'wendelin.bigfile.libvirtmem' DSO as build/lib.linux-x86_64-2.7/wendelin/bigfile/liblibvirtmem.so
            creating build/temp.linux-x86_64-2.7
            creating build/temp.linux-x86_64-2.7/bigfile
            creating build/temp.linux-x86_64-2.7/lib
            x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -ffile-prefix-map=/build/python2.7-vgIf7a/python2.7-2.7.18=. -fstack-protector-strong -Wformat -Werror=format-security -fPIC -D_GNU_SOURCE -I/tmp/pip-build-env-lfVr7E/overlay/lib/python2.7/site-packages -I. -I./include -I./3rdparty/ccan -I./3rdparty/include -Ibuild/lib.linux-x86_64-2.7/. -c bigfile/pagefault.c -o build/temp.linux-x86_64-2.7/bigfile/pagefault.o -fno-strict-aliasing -std=gnu99 -fplan9-extensions -Wno-declaration-after-statement -Wno-error=declaration-after-statement
            In file included from ./include/wendelin/list.h:11,
                             from ./include/wendelin/bigfile/virtmem.h:50,
                             from bigfile/pagefault.c:29:
            ./3rdparty/ccan/ccan/array_size/array_size.h:4:10: fatal error: config.h: No such file or directory
                4 | #include "config.h"
                  |          ^~~~~~~~~~
            compilation terminated.
            error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
            ----------------------------------------
            ERROR: Failed building wheel for wendelin.core
          Failed to build wendelin.core
          ERROR: Could not build wheels for wendelin.core which use PEP 517 and cannot be installed directly
      
      -> Fix it by making build_dso also first go through `make all`.
      
      NOTE we cannot fix it in exactly the same way as for build_ext: if we split
      build_dso into build_dso and ll_build_dso, `make all` will still go to infinite
      recursion: build_dso -> ll_build_dso -> build_dso (not ll_build_dso, this is controlled by setuptools_dso) -> oops.
      ad6305c0
    • Kirill Smelkov's avatar
      wendelin.core v2.0.alpha2 · 5e5ad598
      Kirill Smelkov authored
      5e5ad598
    • Kirill Smelkov's avatar
      wcfs: client: Switch to File IO provided by Pygolang · a36cdcc3
      Kirill Smelkov authored
      Starting from version 0.1 pygolang provides File out of the box:
      
      nexedi/pygolang@4690460b
      https://pypi.org/project/pygolang/#pygolang-change-history
      
      -> Use it and remove our custom File implementation that originally
      served as POC for that pygolang functionality.
      a36cdcc3
  17. 26 Jan, 2022 2 commits
  18. 21 Jan, 2022 6 commits
    • Kirill Smelkov's avatar
      wcfs: Fix crash if on watch request setupWatch needs to access ZODB · 38dde766
      Kirill Smelkov authored
      The problem is similar to a7bf0311 (wcfs: Fix crash if on invalidation
      handledδZ needs to access ZODB) - I forgot to put zhead's transaction into
      context.
      
      Without the fix, the added test fails as:
      
          wcfs_test.py::test_wcfs_crash_old_data
          ---------------- live log call -----------------
          WARNING  ZODB.FileStorage:FileStorage.py:413 Ignoring index for /tmp/testdb_fs.OV0rS6/1.fs
      
          M: commit -> @at0 (03e5a3342bc5ab22)
      
          M: commit -> @at1 (03e5a3342bc88899)
          M:      f<0000000000000002>     [0]
          INFO     wcfs:__init__.py:293 starting for file:///tmp/testdb_fs.OV0rS6/1.fs ...
          I0120 17:12:10.274379  704327 wcfs.go:2393] start "/dev/shm/wcfs/556fa61a9f9675f34c6b44e1f978842c37176c59" "file:///tmp/testdb_fs.OV0rS6/1.fs"
          I0120 17:12:10.274409  704327 wcfs.go:2399] (built with go1.17.6)
          W0120 17:12:10.274560  704327 storage.go:152] zodb: FIXME: open file:///tmp/testdb_fs.OV0rS6/1.fs: raw cache is not ready for invalidations -> NoCache forced
          INFO     wcfs:__init__.py:334 started pid704327 @ /dev/shm/wcfs/556fa61a9f9675f34c6b44e1f978842c37176c59
      
          C: setup watch f<0000000000000002> @at1 (03e5a3342bc88899)
          #  pinok: {}
      
          M: commit -> @at2 (03e5a3342c895777)
          M:      f<0000000000000002>     [1]
      
          M: commit -> @at3 (03e5a3342ca5ef55)
          M:      f<0000000000000002>     [0]
      
          C: setup watch f<0000000000000002> @at2 (03e5a3342c895777)
          #  pinok: {0: @at1 (03e5a3342bc88899)}
          panic: transaction: no current transaction
      
          goroutine 88 [running]:
          lab.nexedi.com/kirr/neo/go/transaction.currentTxn({0x969718, 0xc0000b6240})
                  /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/transaction/transaction.go:59 +0x77
          lab.nexedi.com/kirr/neo/go/transaction.Current(...)
                  /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/transaction/api.go:206
          lab.nexedi.com/kirr/neo/go/zodb.(*Connection).checkTxnCtx(...)
                  /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/connection.go:374
          lab.nexedi.com/kirr/neo/go/zodb.(*Connection).Get(0xc0000c25a0, {0x969718, 0xc0000b6240}, 0x4)
                  /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/connection.go:331 +0x73
          lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/zdata.(*ΔFtail).BlkRevAt(0xc00009dd40, {0x969718, 0xc0000b6240}, 0xc000100540, 0x30, 0x3e5a3342c895777)
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/zdata/δftail.go:1140 +0x39d
          main.(*WatchLink).setupWatch(0xc0000120a0, {0x969718, 0xc0000b6240}, 0x2, 0x3e5a3342c895777)
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/wcfs.go:1754 +0xe3f
          main.(*WatchLink)._handleWatch(0x0, {0x969718, 0xc0000b6240}, {0xc0000a0122, 0x0})
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/wcfs.go:1973 +0x65
          main.(*WatchLink).handleWatch(0x0, {0x969718, 0xc0000b6240}, 0x0, {0xc0000a0122, 0x28})
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/wcfs.go:1955 +0x10c
          main.(*WatchLink)._serve.func3({0x969718, 0xc0000b6240})
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/wcfs.go:1944 +0x3c
          lab.nexedi.com/kirr/go123/xsync.(*WorkGroup).Go.func1()
                  /home/kirr/src/neo/src/lab.nexedi.com/kirr/go123/xsync/xsync.go:86 +0x68
          created by lab.nexedi.com/kirr/go123/xsync.(*WorkGroup).Go
                  /home/kirr/src/neo/src/lab.nexedi.com/kirr/go123/xsync/xsync.go:83 +0x92
          >>> Change history by file:
      
          f<0000000000000002>:
                                          0 1 2 3 4 5 6 7
                                          a b c d e f g h
                  @at0 (03e5a3342bc5ab22)
                  @at1 (03e5a3342bc88899) 0
                  @at2 (03e5a3342c895777)   1
                  @at3 (03e5a3342ca5ef55) 0
      
          ----------------------------------------
      
                  # wcfs was crashing in setting up watch because of "1" and "2" from above, and
                  # 3. setupWatch was calling ΔFtail.BlkRevAt without putting zhead's transaction into ctx.
                  wl2 = t.openwatch()
          >       wl2.watch(zf, at2, {0:at1})
      38dde766
    • Kirill Smelkov's avatar
      wcfs: zdata: ΔFtail tests: Fix/Adjust debug dump for computed blkRevAt · ca3e54e2
      Kirill Smelkov authored
      - put into if block to avoid collision with already-defined-elsewhere blkv
      - show revisions in symbolic form
      
      Noticed while working on recent change to allow ΔFtail/ΔBtail
      point-queries with at=tail.
      ca3e54e2
    • Kirill Smelkov's avatar
      wcfs: tests: Exercise watching @at0 · 769b1c06
      Kirill Smelkov authored
      Watching with at=tail is inevitable as explained in the previous patch.
      769b1c06
    • Kirill Smelkov's avatar
      wcfs: Adjust ΔFtail/ΔBtail to allow point-queries with at=tail · ef10f820
      Kirill Smelkov authored
      This is needed because when e.g. wcfs is just started the coverage of
      ΔFtail is (head,head], i.e. empty, and if the user wants to set up a watch
      with at=head, it becomes a watch with at=tail. That at is then used in a
      query, and if point-queries with at=tail are disallowed it panics with
      "at out of bounds".
      
      This fixes crashes in test_wcfs_watch_setup (see 339f1884 "wcfs: tests:
      Always start tDB with ZBigFile pre-created before WCFS startup") and in
      test_wcfs_crash_old_data (see 97ce5105 "wcfs: tests: Add test to
      demonstrate "at out of bounds" crash on readPinWatchers ->
      ΔFtail.BlkRevAt").
      
      For reference, zodb.ΔTail already allows point queries with at=tail:
      
      https://lab.nexedi.com/kirr/neo/blob/1193c44e/go/zodb/δtail.go#L202-206
      https://lab.nexedi.com/kirr/neo/blob/1193c44e/go/zodb/δtail.go#L225-228
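
The difference between the two bounds checks can be sketched in Python (the real code is Go). The function `check_at` and its `allow_tail` flag are hypothetical and only model the condition described above.

```python
def check_at(at, tail, head, allow_tail=True):
    """Raise if a point-query revision is outside the covered range.

    With allow_tail=False only (tail, head] is accepted, which rejects
    every query when coverage is the empty range (head, head].
    With allow_tail=True, at == tail is accepted as well.
    """
    lower_ok = (tail <= at) if allow_tail else (tail < at)
    if not (lower_ok and at <= head):
        raise ValueError("at out of bounds: at: @%016x,  (tail, head] = (@%016x, @%016x]"
                         % (at, tail, head))
```

Right after startup tail == head, so a watch set up with at=head only passes the check when at=tail is allowed.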
      ef10f820
    • Kirill Smelkov's avatar
      wcfs: tests: Add test to demonstrate "at out of bounds" crash on readPinWatchers -> ΔFtail.BlkRevAt · 97ce5105
      Kirill Smelkov authored
      The codepath that sends pin messages to watchers on FUSE READ, similarly
      to what was shown in 339f1884, is also vulnerable to the "at out of bounds"
      panic if at=ΔFtail.tail:
      
          wcfs_test.py::test_wcfs_crash_old_data
          ---------------- live log call -----------------
          WARNING  ZODB.FileStorage:FileStorage.py:413 Ignoring index for /tmp/testdb_fs.nbSKXu/1.fs
      
          M: commit -> @at0 (03e5a31e5e5ef6bb)
      
          M: commit -> @at1 (03e5a31e5e63fa77)
          M:      f<0000000000000002>     [0]
          INFO     wcfs:__init__.py:293 starting for file:///tmp/testdb_fs.nbSKXu/1.fs ...
          I0120 16:50:22.136098  697106 wcfs.go:2393] start "/dev/shm/wcfs/93026d44ef96f87df2cc0e2e451c5aabee91b652" "file:///tmp/testdb_fs.nbSKXu/1.fs"
          I0120 16:50:22.136127  697106 wcfs.go:2399] (built with go1.17.6)
          W0120 16:50:22.136233  697106 storage.go:152] zodb: FIXME: open file:///tmp/testdb_fs.nbSKXu/1.fs: raw cache is not ready for invalidations -> NoCache forced
          INFO     wcfs:__init__.py:334 started pid697106 @ /dev/shm/wcfs/93026d44ef96f87df2cc0e2e451c5aabee91b652
      
          C: setup watch f<0000000000000002> @at1 (03e5a31e5e63fa77)
          #  pinok: {}
          panic: at out of bounds: at: @03e5a31e5e63fa77,  (tail, head] = (@03e5a31e5e63fa77, @03e5a31e5e63fa77]
      
          goroutine 7 [running]:
          lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/zdata.panicf(...)
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/zdata/misc.go:47
          lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/zdata.(*ΔFtail).BlkRevAt(0xc0000a5d40, {0x969718, 0xc000076140}, 0xc0001a22a0, 0xc0001c0200, 0x3e5a31e5e63fa77)
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/zdata/δftail.go:1077 +0xa45
          main.(*BigFile).readPinWatchers(0xc0001d0200, {0x969718, 0xc000076140}, 0x0, 0xffffffffffffffff)
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/wcfs.go:1559 +0x2a5
          main.(*BigFile).readBlk(0xc0001d0200, {0x969718, 0xc000076140}, 0x0, {0xc000320000, 0x200000, 0x0})
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/wcfs.go:1281 +0x4d2
          main.(*BigFile).Read.func1({0x969718, 0xc000076140})
                  /home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core/wcfs/wcfs.go:1223 +0x71
          lab.nexedi.com/kirr/go123/xsync.(*WorkGroup).Go.func1()
                  /home/kirr/src/neo/src/lab.nexedi.com/kirr/go123/xsync/xsync.go:86 +0x68
          created by lab.nexedi.com/kirr/go123/xsync.(*WorkGroup).Go
                  /home/kirr/src/neo/src/lab.nexedi.com/kirr/go123/xsync/xsync.go:83 +0x92
          >>> Change history by file:
      
          f<0000000000000002>:
                                          0 1 2 3 4 5 6 7
                                          a b c d e f g h
                  @at0 (03e5a31e5e5ef6bb)
                  @at1 (03e5a31e5e63fa77) 0
      
          ...
      
              @func
              def test_wcfs_crash_old_data():
                  # start wcfs with ΔFtail/ΔBtail not covering that initial data.
                  t = tDB(old_data=[{0:'a'}]); zf = t.zfile; at1 = t.head
                  defer(t.close)
      
                  f = t.open(zf)
      
                  # ΔFtail coverage is currently (at1,at1]
                  wl = t.openwatch()
                  wl.watch(zf, at1, {})
      
                  # wcfs is crashing on readPinWatcher -> ΔFtail.BlkRevAt with
                  #   "at out of bounds: at: @at1,  (tail,head] = (@at1,@at1]
                  # because BlkRevAt(at=tail) query was disallowed.
          >       f.assertBlk(0, 'a')          # [0] becomes tracked
      
      Still also crashing in test_wcfs_watch_setup.
      97ce5105
    • Kirill Smelkov's avatar
      wcfs: tests: Move tests for crashing WCFS due to old data to dedicated section · 67519be7
      Kirill Smelkov authored
      Soon this test will also exercise functionality from the isolation protocol,
      and so it will stop being basic.
      
      Move plus rename test_wcfs_basic_invalidation_wo_dFtail_coverage ->
      test_wcfs_crash_old_data.
      
      Still crashing in test_wcfs_watch_setup.
      67519be7