1. 05 May, 2026 1 commit
    • wcfs: Teach safe multi-user support · 52d6e54e
      Levin Zimmermann authored and Kirill Smelkov committed
      By default, FUSE restricts access to the mounted filesystem to the
      user who performed the mount. This prevents other users from accessing
      WCFS, limiting multi-user deployments.
      
      FUSE's 'allow_other' option [1] enables access for all users, but this
      can create a security risk on systems where only some users are trusted.
      
      This patch introduces a new '-sharewith' flag that allows specifying
      an OS group with which WCFS access is shared. 'allow_other' is only
      enabled if this flag is set, preventing unintentional exposure.
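      For illustration, a deployment could start WCFS with shared access
      approximately like this; the exact invocation syntax is an assumption
      here - only the '-sharewith' flag name comes from this patch:

          # hypothetical launcher: share WCFS access with OS group "wendelin"
          import subprocess

          def start_wcfs_shared(zurl, group="wendelin"):
              # with -sharewith set, wcfs enables FUSE allow_other, but limits
              # effective access to members of the named group
              return subprocess.Popen(["wcfs", "serve", "-sharewith", group, zurl])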
      
      NOTE Automatically testing this feature is difficult because it requires
      privileged operations. Therefore, this patch adds a manual test at
      'wcfs/testprog/wcfs_verify_permissions.py'.
      
      [1] See 'allow_other' option at
          https://docs.kernel.org/filesystems/fuse/fuse.html
      
      --------
      kirr:
      
      - redevelop the test almost from scratch to be run automatically via unshare + subordinate uid/gid
      - also fix mode for directories, not only for files
      - activate "default...
  2. 18 Mar, 2026 4 commits
  3. 30 Jan, 2026 2 commits
  4. 17 Nov, 2025 6 commits
    • wcfs: Don't pinkill clients that become killed by OS or otherwise exit when they receive pin notification · 87756f06
      Kirill Smelkov authored
      
      When a block is changed by a new transaction, WCFS notifies clients
      that already have that block mmapped about the need to remmap the block
      to a particular revision, via a pin notification over the watchlink
      channel. That pin notification mechanism must be followed collectively
      and cooperatively by both WCFS and _all_ clients for it to work
      reliably, i.e. for WCFS to never enter a situation where corrupt data
      are provided to a client. The mechanism involves acknowledgements from
      clients, and WCFS waits for that acknowledgement before unpausing _all_
      clients that try to read the block simultaneously(*). So if a client is
      slow to respond to a pin notification, or does not respond at all, it
      creates a "progress" problem for everyone else and the system can get
      stuck.
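      To make the cooperative part concrete, here is a minimal sketch of a
      client-side pin handler, assuming the raw wire format quoted further
      below in this log ("<message-id> pin <bigfileX> #<blk> @<rev> \n") and
      assuming the reply is an acknowledgement tagged with the same message
      ID; the helper names (serve_pins, remmap) are illustrative:

          # minimal client-side pin handler sketch (names illustrative)
          def serve_pins(wlink, remmap):
              for line in wlink:                           # raw watchlink lines
                  msgid, _, req = line.rstrip('\n').partition(' ')
                  if req.startswith('pin '):
                      _, foid, blk, rev = req.split()
                      remmap(foid, int(blk[1:]), rev[1:])  # remmap #blk to @rev
                      # reply format assumed: same msgid + ack; WCFS waits for
                      # this before unpausing other readers of the block
                      wlink.write('%s ack\n' % msgid)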
      
      To solve that problem WCFS implements protection against slow / faulty
      clients: such a client is killed by WCFS with SIGBUS to unblock the
      system and to make progress, while maintaining the invariant that all
      alive clients are provided with correct data. That logic was implemented
      in c559ec1a (wcfs: Implement protection against faulty client) and is
      documented in (+). It works, but there is one peculiarity: if a client,
      upon receiving a pin notification, is killed by a separate mechanism,
      e.g. via a kill signal from the OS, WCFS still wants to kill that client
      and logs a huge warning about it, because stuck/incorrect clients are
      considered abnormal.
      
      -> Fix emission of that warning by first checking whether the client is
         still alive when the "bad pin reply" condition is detected, and
         avoiding the warning if the client is not there anymore.
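      The check itself lives in wcfs.go; rendered in Python the idea looks
      approximately like the following. Note that a client killed by the OS
      lingers in the process table as a zombie, so a plain signal-0 probe is
      not enough - the state field of /proc/<pid>/stat has to be consulted:

          def client_alive(pid):
              try:
                  with open('/proc/%d/stat' % pid) as f:
                      stat = f.read()
              except IOError:
                  return False                 # process fully gone from the table
              # state is the first field after "(comm)"; comm itself may
              # contain spaces and parentheses, so split from the right
              state = stat.rsplit(')', 1)[1].split()[0]
              return state != 'Z'              # Z = zombie, i.e. already dying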
      
      For reference, here is what happens when a client is killed by the OS:
      
      - the OS kernel starts to close all file descriptors of the client process
      - which invokes closure of the opened head/watch handles
      - which might trigger on the WCFS side e.g. a "peer closed its end" error
        when WCFS tries to send the pin notification, or "unexpected EOF" /
        "await canceled" when WCFS waits for the reply
      - the OS kernel switches the process's state to Z (Zombie) in the process
        table
      
      This patch takes the following patch by Levin into account: levin.zimmermann/wendelin.core@ff8a2d1a .
      
      (*) Isolation protocol description: https://lab.nexedi.com/nexedi/wendelin.core/-/blob/c0ffbcda/wcfs/wcfs.go#L93-183
      (+) Protection against slow or faulty clients: https://lab.nexedi.com/nexedi/wendelin.core/-/blob/c0ffbcda/wcfs/wcfs.go#L186-217
      
      
      
      Co-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com>
      /reviewed-on !33
    • wcfs: Add tests to show WCFS kills dead or dying clients · bc2f2425
      Levin Zimmermann authored and Kirill Smelkov committed
      When a WCFS client doesn't respond to a pin request in time, the server
      attempts to kill it [1]. However, there are cases where the client may
      stop for unrelated reasons (e.g. being restarted by another program)
      after the pin request is sent. In such situations, WCFS should not
      forcefully kill the client, as this leads to misleading logs suggesting
      the client was faulty, when in fact it was simply restarted.
      
      This commit adds tests that reproduce these scenarios and verify
      that WCFS only kills clients when truly necessary.
      
      Note: these tests currently fail, as WCFS still kills dead or dying clients.
      
      [1] nexedi/wendelin.core@c559ec1a
      
      --------
      kirr:
      
      - use os._exit instead of sys.exit to simulate OS-level process kill (see
        the sketch after this list)
      - add 2·pinkill sleep before verifying that the process was not pinkilled by wcfs
      - mark added test with xfail
      - don't duplicate code
      - cosmetics
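      The first point is worth a sketch: sys.exit raises SystemExit, so
      Python-level cleanup still runs and e.g. an open watchlink is closed
      gracefully, while os._exit terminates the process immediately and leaves
      the kernel to tear down the file descriptors - exactly what an OS-level
      kill looks like to WCFS:

          import os, sys

          def die_like_killed():
              os._exit(1)     # immediate termination; kernel closes the fds

          def die_gracefully():
              sys.exit(1)     # raises SystemExit; cleanup handlers still run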
      
      The added test currently fails, e.g.:
      
          INFO     wcfs:__init__.py:301 starting for file:///tmp/testdb_fs.01bOZy/1.fs ...
          I1110 15:21:59.221350 2938668 wcfs.go:2754] start "/dev/shm/wcfs/5d8d6942d7f39fa05fe1024e4c8a8c21a44e1254" "file:///tmp/testdb_fs.01bOZy/1.fs"
          I1110 15:21:59.221419 2938668 wcfs.go:2760] (built with go1.25.4)
          W1110 15:21:59.221546 2938668 15:21] 9.221542 zodb: FIXME: open file:///tmp/testdb_fs.01bOZy/1.fs: raw cache is not ready for invalidations -> NoCache forced
          INFO     wcfs:__init__.py:343 started pid2938668 @ /dev/shm/wcfs/5d8d6942d7f39fa05fe1024e4c8a8c21a44e1254
      
          M: commit -> @at1 (0404bfc5fd8b75ee)
          M:      f<0000000000000010>     [2]
      
          M: commit -> @at2 (0404bfc5fdadfd00)
          M:      f<0000000000000010>     [2]
      
          C: setup watch f<0000000000000010> @at1 (0404bfc5fd8b75ee)
          #  pinok: {2: @at1 (0404bfc5fd8b75ee)}
          E1110 15:22:00.463181 2938668 wcfs.go:1603] pid2938691: client failed to handle pin notification correctly and timely in 3s: pin #2 @0404bfc5fd8b75ee: sendReq: waiting for reply: context canceled
          E1110 15:22:00.463203 2938668 wcfs.go:1603] pid2938691: -> killing it because else 1) all other clients will remain stuck, and 2) we no longer can provide correct data to the faulty client.
          E1110 15:22:00.463209 2938668 wcfs.go:1603] pid2938691:    (see "Protection against slow or faulty clients" in wcfs description for details)
          E1110 15:22:00.463217 2938668 wcfs.go:1642] pid2938691: <- SIGBUS
          E1110 15:22:00.463394 2938668 wcfs.go:1603] pid2938691: terminated
          E1110 15:22:00.463420 2938668 wcfs.go:2085] wlink 1: serve rx: unexpected EOF
          >>> Change history by file:
      
          f<0000000000000010>:
                                          0 1 2 3 4 5 6 7
                                          a b c d e f g h
                  @at0 (0404bfc5fca35122)
                  @at1 (0404bfc5fd8b75ee)     2
                  @at2 (0404bfc5fdadfd00)     2
      
          INFO     wcfs:__init__.py:418 unmount/stop wcfs pid2938668 @ /dev/shm/wcfs/5d8d6942d7f39fa05fe1024e4c8a8c21a44e1254
          I1110 15:22:09.606301 2938668 wcfs.go:2942] stop "/dev/shm/wcfs/5d8d6942d7f39fa05fe1024e4c8a8c21a44e1254" "file:///tmp/testdb_fs.01bOZy/1.fs"
          FAILED
      
          ============================================ FAILURES =============================================
          ___________________ test_wcfs_pinhfaulty_kill_on_watch[_bad_watch_stop_on_pin] ____________________
      
          faulty = <function _bad_watch_stop_on_pin at 0x7f45ee31aad0>, with_prompt_pintimeout = None
      
              @mark.parametrize('faulty', [
                  _bad_watch_no_pin_read,
                  _bad_watch_no_pin_reply,
                  _bad_watch_stop_on_pin,
                  _bad_watch_eof_pin_reply,
                  _bad_watch_nak_pin_reply,
              ])
              @func
              def test_wcfs_pinhfaulty_kill_on_watch(faulty, with_prompt_pintimeout):
                  t = tDB(multiproc=True); zf = t.zfile
                  defer(t.close)
      
                  at1 = t.commit(zf, {2:'c1'})
                  at2 = t.commit(zf, {2:'c2'})
                  f = t.open(zf)
                  f.assertData(['','','c2'])
      
                  # launch faulty process that should be killed by wcfs on problematic pin during watch setup
                  p = tFaultySubProcess(t, faulty, at=at1)
                  defer(p.close)
                  t.assertStats({'pinkill': 0})
      
                  # wait till faulty client issues its watch, receives pin and pauses/misbehaves
                  p.send("start watch")
                  if faulty != _bad_watch_no_pin_read:
                      assert p.recv(t.ctx) == b"pin %s #%d @%s" % (h(zf._p_oid), 2, h(at1))
      
                  # issue our watch request - it should be served well and without any delay
                  wl = t.openwatch()
                  wl.watch(zf, at1, {2:at1})
      
                  # the faulty client must become killed by wcfs
                  # but client that stops itself, must not be killed
                  must_kill = (faulty != _bad_watch_stop_on_pin)
                  if not must_kill:
                      # give time to wcfs to detect wlink close and potentially initiate pinkill
                      # do not wait on the process yet, so it remains in the OS process table in Z state
                      xsleep(t.ctx, 2*t.pintimeout)
                  p.join(t.ctx)
                  assert p.exitcode is not None
          >       t.assertStats({'pinkill': int(must_kill)})
      
          wcfs_faultyprot_test.py:213:
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      
          t = <wendelin.wcfs.wcfs_test.tDB object at 0x7f45edf9dd70>, kvok = {'pinkill': 0}
      
              def assertStats(t, kvok):
                  # kstats loads stats subset with kvok keys.
                  def kstats():
                      stats = t._loadStats()
                      kstats = {}
                      for k in kvok.keys():
                          kstats[k] = stats.get(k, None)
                      return kstats
      
                  # wait till stats reaches expected state
                  ctx = timeout()
                  while 1:
                      kv = kstats()
                      if kv == kvok:
                          break
                      if ctx.err() is not None:
          >               assert kv == kvok, "stats did not reach expected state"
          E               AssertionError: stats did not reach expected state
          E               assert {'pinkill': 1} == {'pinkill': 0}
          E                 Differing items:
          E                 {'pinkill': 1} != {'pinkill': 0}
          E                 Full diff:
          E                 - {'pinkill': 1}
          E                 ?             ^
          E                 + {'pinkill': 0}
          E                 ?             ^
      
          wcfs_test.py:478: AssertionError
      
      Original patch: levin.zimmermann/wendelin.core@8d3fa76d
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!33
    • wcfs: tests: Increase each test timeout from 10s to 15s · c1448476
      Kirill Smelkov authored
      Each wcfs test is run under a timeout to detect e.g. that something is
      stuck, and to forcibly unmount the filesystem in such a case. That
      overall timeout is much lower than the regular 30s pinkill timeout, and
      much higher than the 3s pinkill timeout used in the faulty-protection
      tests.
      
      The faulty-protection tests usually need to wait for a kill to happen
      for 2·pinkill time to reliably detect that event in the presence of
      surrounding OS load, and that was working quite ok so far because the
      actual kill was usually happening around 1·pinkill after the start of
      the wait. However, in the next patch we will need to test that wcfs
      does _not_ kill an innocent client, similarly waiting for that 2·pinkill
      time, but there it will be a full 2·pinkill sleep without trimming,
      because for the negative condition (does _not_ kill) we need to
      predictably wait much longer than the pinkill time. With that 6s just
      for the sleep, plus test setup and other overhead, things start to
      trigger the overall timeout quite frequently.
      
      -> Increase that overall timeout from 10s to 15s to cover that "need to
         wait longer" situation, while still maintaining the invariant that
         the timeout stays much lower than the regular pinkill time and much
         higher than the pinkill time of the faultyprot tests.
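      Spelled out with the numbers from the text above, the invariant being
      maintained is (a plain consistency check, not code from the patch):

          pinkill_faultyprot = 3    # s, pinkill timeout in faulty-protection tests
          test_timeout       = 15   # s, overall per-test timeout (was 10)
          pinkill_regular    = 30   # s, regular pinkill timeout

          # the 2·pinkill sleep (6s) plus setup overhead must fit into the
          # test timeout, which in turn must stay below the regular pinkill time
          assert 2*pinkill_faultyprot < test_timeout < pinkill_regular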
      
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!33
    • wcfs: tests: faultyprot: Clarify why raw-level pin handler sends trimmed message to its supervisor · a5251341
      Kirill Smelkov authored
      Both WatchLink-level and raw-level faulty-protection tests receive pin
      messages and send their content to the supervisor process. Tests that
      work at WatchLink level receive pin messages via WatchLink.recvReq; they
      look like:
      
      	pin <bigfileX> #<blk> @<rev>
      
      however raw reading from an opened head/watch handle returns the same
      string prefixed with a message ID and suffixed with \n:
      
      	<message-id> pin <bigfileX> #<blk> @<rev> \n
      
      The supervisor process does not care about transport-level details and
      wants to observe only the semantics of which pin message was received.
      The high-level tests already match that by sending exactly what
      WatchLink.recvReq gave them, while the raw-level test needs to trim the
      received line to match that.
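      In other words, the raw-level test needs something like the following
      (a sketch; the helper name is illustrative):

          def trim_raw_pin(line):
              # b"3 pin f<X> #2 @<rev> \n"  ->  b"pin f<X> #2 @<rev>"
              msgid, _, payload = line.rstrip(b'\n').partition(b' ')
              return payload.rstrip()   # drop framing; keep the semantic part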
      
      -> Add corresponding comment to make that clear.
      
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!33
    • wcfs: tests: faultyprot: Remove code-duplication in between "no pin read" and "eof pin reply" tests · dc5b1fca
      Kirill Smelkov authored
      In wcfs_faultyprot_test.py we have tests that exercise wcfs behaviour
      against faulty clients that do not handle pin notifications well. There
      are several scenarios tested. The scenarios that do the tests at
      WatchLink level already use common functions to set up the watchlink and
      perform shared actions. However, tests that exercise behaviour with the
      watchlink being opened at raw level, with a regular open syscall instead
      of via the WatchLink class, were duplicating code for their setup. Those
      tests were added in c91fb14e (wcfs: tests: Extend faulty protection tests
      with more kinds of faulty clients).
      
      -> Refactor the code to remove the duplication, because soon we will
         need to add more tests that exercise behaviour with raw-level IO on
         the watchlink, and so we first need to bring order and structure as a
         preparatory step for that.
      
      Plain code refactoring without semantic change.
      
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!33
    • wcfs/*: Fix typos · 4b46f660
      Kirill Smelkov authored
      /reviewed-by @levin.zimmermann
      /reviewed-on !33
  5. 11 Nov, 2025 1 commit
    • wcfs: v↑ go123 · 6aef68d0
      Levin Zimmermann authored and Kirill Smelkov committed
      This updates the version of go123. The new version supports Go 1.25,
      which is required because we want to compile wendelin.core with
      Go ≥ 1.24 to fix a memory leak in NEO/go [1].
      
      [1] See kirr/neo!11 for more context.
      
      /reviewed-by @kirr
      /reviewed-on !45
  6. 26 Sep, 2025 1 commit
    • wcfs: _os: Fix gettid for older glibc · c0ffbcda
      Kirill Smelkov authored
      Kazuhiko reports that
      
          On one of my servers, from wendelin.bigarray.array_zodb import ZBigArray fails.
      
          >>> from wendelin.bigarray.array_zodb import ZBigArray
          Traceback (most recent call last):
            File "<console>", line 1, in <module>
            File "/(SR)/parts/wendelin.core/bigarray/array_zodb.py", line 32, in <module>
              from wendelin.bigfile.file_zodb import ZBigFile
            File "/(SR)/parts/wendelin.core/bigfile/file_zodb.py", line 166, in <module>
              from wendelin.bigfile._file_zodb import _ZBigFile
            File "bigfile/_file_zodb.pyx", line 1, in init wendelin.bigfile._file_zodb
              # -*- coding: utf-8 -*-
            File "/(SR)/parts/wendelin.core/wcfs/__init__.py", line 87, in <module>
              from wendelin.wcfs.internal import glog
            File "/(SR)/parts/wendelin.core/wcfs/internal/glog.py", line 31, in <module>
              from wendelin.wcfs.internal import os as xos
            File "/(SR)/parts/wendelin.core/wcfs/internal/os.py", line 32, in <module>
              from wendelin.wcfs.internal._os import gettid
          ImportError: /(SR)/parts/wendelin.core/wcfs/internal/_os.so: undefined symbol: gettid
      
      That happens because gettid is available only starting from glibc 2.30
      
          https://www.man7.org/linux/man-pages/man2/gettid.2.html
      
      and the system there is likely older.
      
      We already use gettid via syscall(SYS_gettid) in wcfs/client/wcfs_misc.cpp:
      
          https://lab.nexedi.com/nexedi/wendelin.core/-/blob/e8a00ac0/wcfs/client/wcfs_misc.cpp#L254
      
      -> Do the same thing in wcfs/internal/_os.pyx to fix it.
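      For illustration, here is the same workaround rendered at Python level
      via ctypes (the actual fix is in the .pyx file; the syscall number is
      architecture-specific):

          import ctypes

          SYS_gettid = 186    # x86_64 value; differs on other architectures

          def gettid():
              # call the raw syscall instead of the glibc wrapper, which only
              # exists since glibc 2.30
              return ctypes.CDLL(None, use_errno=True).syscall(SYS_gettid)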
      
      /reported-by @kazuhiko
      /reported-on nexedi/erp5@c8998ed0 (comment 244984)
      /helped-and-reviewed-by @jerome
      /reviewed-on nexedi/wendelin.core!44
  7. 11 Jul, 2025 13 commits
  8. 09 Jun, 2025 2 commits
    • wcfs: os: Fix Proc.fd to handle EPERM properly · c93074ac
      Kirill Smelkov authored
      While reviewing the previous patch I noticed that Proc.fd misbehaves
      when there is a permission error:
      
          ---- 8< ---- (x.py)
          from wendelin.wcfs.internal import os as xos
          pdbc = xos.ProcDB.open()
          ---- 8< ----
      
          (neo) (py311.venv) (g.env) kirr@deca:~/src/neo/src/lab.nexedi.com/nexedi/wendelin.core$ python x.py
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/wendelin.core/x.py", line 3, in <module>
              pdbc = xos.ProcDB.open()
                     ^^^^^^^^^^^^^^^^^
            File "/home/kirr/src/wendelin/venv/py311.venv/lib/python3.11/site-packages/decorator.py", line 235, in fun
              return caller(func, *(extras + args), **kw)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            File "/home/kirr/src/tools/go/pygolang-master/golang/__init__.py", line 166, in _goframe
              return f(*argv, **kw)
                     ^^^^^^^^^^^^^^
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/internal/os.py", line 330, in open
              proc.get(name)
            File "/home/kirr/src/wendelin/venv/py311.venv/lib/python3.11/site-packages/decorator.py", line 235, in fun
              return caller(func, *(extras + args), **kw)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            File "/home/kirr/src/tools/go/pygolang-master/golang/__init__.py", line 166, in _goframe
              return f(*argv, **kw)
                     ^^^^^^^^^^^^^^
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/internal/os.py", line 590, in get
              v = eraise(v)
                  ^^^^^^^^^
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/internal/os.py", line 544, in eraise
              raise e
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/internal/os.py", line 563, in get
              v = f(proc)
                  ^^^^^^^
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/internal/os.py", line 704, in fd
              ifd.pos     = int(e.pop("pos"))
                                ^
          UnboundLocalError: cannot access local variable 'e' where it is not associated with a value
      
      The problem happens because Proc.fd, like many other methods, catches
      OSError+IOError to see if the error was a transient ENOENT, but forgets
      to reraise the exception if it was not.
      
      -> Fix that.
      
      I also checked all the other places that do such OSError+IOError
      filtering, and Proc.fd was the only one that missed the reraise.
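      The pattern in question, sketched with an illustrative helper: a /proc
      file may legitimately vanish mid-read because the process exited - that
      is the transient ENOENT being filtered for - while any other error,
      e.g. EPERM, must propagate:

          import errno

          def read_proc_file(path):
              try:
                  with open(path) as f:
                      return f.read()
              except (OSError, IOError) as e:
                  if e.errno == errno.ENOENT:
                      return None   # transient: the process went away
                  raise             # the forgotten step: reraise EPERM & co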
      
      Thorough tests for ProcDB and MountDB are still TODO.
      
      Fixes 7932bac5 (wcfs: os: Add ProcDB & co)
      
      /reviewed-by @levin.zimmermann
      /reviewed-on !40 (comment 237564)
    • wcfs: os: Fix fetching file descriptor information for kernel<5.14 · acb8ad59
      Levin Zimmermann authored and Kirill Smelkov committed
      As described by Kirill, Linux kernels older than 5.14 do not yet support
      the 'ino' entry:
      
      "However after rechecking it looks like the ino entry was added only
      "recently" in 2021 in Linux 5.14
      (https://git.kernel.org/linus/3845f256a8b5)." [1]
      
      Therefore we need to make the fetching of the 'ino' entry optional to
      support older kernels.
      
      [1] nexedi/slapos!1815 (comment 236971)
      
      --------
      kirr: I checked the whole codebase and we do not use fd.ino anywhere
      yet, so it is ok to change the interface at this time.
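      A sketch of what the optional fetch amounts to (the /proc/<pid>/fdinfo/<fd>
      field layout is documented in proc(5); the parsing below is illustrative,
      not the actual os.py code):

          def parse_fdinfo(text):
              e = dict(l.split(':', 1) for l in text.splitlines() if ':' in l)
              fd = {'pos':   int(e.pop('pos')),
                    'flags': int(e.pop('flags'), 8)}   # flags are octal
              ino = e.pop('ino', None)                 # present only on Linux >= 5.14
              fd['ino'] = int(ino) if ino is not None else None
              return fd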
      
      /reviewed-by @kirr
      /reviewed-on nexedi/wendelin.core!40
  9. 03 Jun, 2025 2 commits
    • fixup! wcfs: py: Log with date and time present · 564e0986
      Kirill Smelkov authored
      Thomas reports that wcfs crashes in glog.basicConfig on py2, and indeed,
      checking things, it looks like this:
      
          wendelin.core/D$ python --version
          Python 2.7.18
      
          wendelin.core/D$ wcfs status file://`pwd`/1.fs
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/venv/z-dev/bin/wcfs", line 11, in <module>
              load_entry_point('wendelin.core', 'console_scripts', 'wcfs')()
            File "<decorator-gen-42>", line 2, in main
            File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 166, in _goframe
              return f(*argv, **kw)
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/__init__.py", line 1108, in main
              glog.basicConfig(stream=sys.stderr, level=logging.INFO)
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/internal/glog.py", line 36, in basicConfig
              logging.setLogRecordFactory(LogRecord)
          AttributeError: 'module' object has no attribute 'setLogRecordFactory'
      
      This happens because, while py3 logging has setLogRecordFactory, py2
      logging does not.
      
      -> Fix that by changing logging.LogRecord on py2 directly.
      
      However, a fresh review of the glog.py module reveals more problems:
      
      On py2 there is no Formatter.formatMessage, and so any logging attempt
      crashes with
      
          Traceback (most recent call last):
            File "/usr/lib/python2.7/logging/__init__.py", line 868, in emit
              msg = self.format(record)
            File "/usr/lib/python2.7/logging/__init__.py", line 741, in format
              return fmt.format(record)
            File "/usr/lib/python2.7/logging/__init__.py", line 469, in format
              s = self._fmt % record.__dict__
          KeyError: 'levelchar'
      
      -> Fix that by moving .levelchar initialization to LogRecord constructor.
      
      Another problem is that glog.basicConfig was ignoring its level argument.
      This way, even if user code from wcfs invoked it with
      level=logging.INFO (see f9a40d36 "wcfs: py: Switch loglevel from WARNING
      -> INFO for wcfs.py commands"), no log messages at info level were
      logged.
      
      -> Fix that by setting root's logger level as instructed.
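      Taken together, the approach looks approximately like this (a simplified
      sketch, not the actual glog.py code):

          import logging, sys

          class LogRecord(logging.LogRecord):
              def __init__(self, *argv, **kw):
                  logging.LogRecord.__init__(self, *argv, **kw)
                  # set .levelchar in the constructor so that py2's Formatter,
                  # which formats via "self._fmt % record.__dict__" and has no
                  # formatMessage, also sees it
                  self.levelchar = self.levelname[0]    # I/W/E/...

          def basicConfig(**kw):
              if sys.version_info >= (3,):
                  logging.setLogRecordFactory(LogRecord)
              else:
                  logging.LogRecord = LogRecord         # py2 has no record factory
              logging.basicConfig(**kw)                 # honour the level argument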
      
      I'm not sure how I missed all those problems when preparing the original
      patch.
      
      After this patch, wcfs.py logging is hopefully back to working properly
      on both py2 and py3.
      
      /fixes e51bef0d (wcfs: py: Log with date and time present)
      /reported-by @tomo
      /reported-on nexedi/slapos!1815 (comment 236620)
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!38
    • wcfs: tests: Test WCFS CLI · 2dbce639
      Levin Zimmermann authored and Kirill Smelkov committed
      nexedi/wendelin.core!38 attempts
      to fix bugs introduced with nexedi/wendelin.core@e51bef0d.
      We didn't see these bugs in the test results, because the WCFS CLI
      codepath is not covered by our tests. This patch adds coverage of that
      codepath to increase the likelihood that issues are detected quickly.
      
      --------
      kirr: Rework Levin's original patch so as not to pollute the global
      state of the test process with e.g. glog logging setup. The original
      patch is here: 3cb3872f
      
      The added test currently fails with
      
          wcfs/wcfs_test.py::test_wcfs_main
          Exception in subprocess wcfs.wcfs_test._test_wcfs_main (pid936172):
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/internal/multiprocessing.py", line 84, in _start
              r = f(*argv, **kw)
            File "wcfs/wcfs_test.py", line 2047, in _test_wcfs_main
              wcfs.main()
            File "<decorator-gen-42>", line 2, in main
            File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 166, in _goframe
              return f(*argv, **kw)
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/__init__.py", line 1108, in main
              glog.basicConfig(stream=sys.stderr, level=logging.INFO)
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/internal/glog.py", line 36, in basicConfig
              logging.setLogRecordFactory(LogRecord)
          AttributeError: 'module' object has no attribute 'setLogRecordFactory'
      
          ...
      
                  zurl = "file://abc"
                  _("serve",  ("-arg0", zurl),
          >                   (zurl, ("-arg0",)))
      
          ...
      
                  if end.exc is not None:
          >           raise end.exc
          E           AttributeError: 'module' object has no attribute 'setLogRecordFactory'
      
      it will be fixed by the next patch.
      
      /reviewed-by @kirr, @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!38, nexedi/wendelin.core!39
  10. 21 May, 2025 8 commits
    • bigfile: Fix "unused variable" warnings · 7456ba12
      Kirill Smelkov authored
          bigfile/_bigfile.c: In function ‘pyfileh_dealloc’:
          bigfile/_bigfile.c:496:18: warning: unused variable ‘pyfile’ [-Wunused-variable]
            496 |     PyBigFile   *pyfile;
                |                  ^~~~~~
          bigfile/_bigfile.c:495:18: warning: unused variable ‘file’ [-Wunused-variable]
            495 |     BigFile     *file    = fileh->file;
                |                  ^~~~
      
      These warnings were there from the beginning, starting with 35eb95c2
      (bigfile: Python wrapper around virtual memory subsystem).
    • wcfs: status + stop robustification · 86a74ffe
      Kirill Smelkov authored
      Hello @levin.zimmermann.
      
      These are the patches from our late-2024 trial to deploy WCFS, which I
      think are already an improvement and ok to go.
      
      Please see the patches for details.
      
      Kirill
      
      /reviewed-by @levin.zimmermann
      /reviewed-on !36
    • wcfs: Fix and enhance `wcfs stop` to be more reliable · 82f0eb4b
      Kirill Smelkov authored
      Since the beginning of wcfs - since e3f2ee2d (wcfs: Initial
      implementation of basic filesystem) - `wcfs stop` was implemented as
      just `fusermount -u`. That, however, turned out not to be robust: if
      wcfs is deadlocked, unmounting hangs, and if the wcfs server has crashed
      but there are still running client processes, unmount fails with a
      "Device or resource busy" error.
      
      For the deadlocked case we often see a situation where both wcfs and
      client zope processes are hung, kill -9 does not work on them (they
      still remain hung), and there is no easy way to do the unmount and
      restart wcfs.
      
      -> Fix `wcfs stop` to do that by first breaking the deadlock via
      /sys/fs/fuse/connections/<X>/abort and making sure that:
      
      1) wcfs.go is not running,
      2) all remaining clients are terminated, and
      3) the mount is also gone
      
      In many ways this coincides with what Server.stop was already doing, so
      here we teach `wcfs stop` to work via that Server.stop codepath, and
      adjust the latter to also work when Server._proc is not only a
      subprocess.Popen that the current process spawned, but also an xos.Proc
      that `wcfs stop` discovered - which can also be None if wcfs.go crashed
      by itself.
      
      As explained in the comments, I decided to kill the client processes
      instead of doing the final unmount try lazily, because
      
          # NOTE if we do `fusermount -uz` (lazy unmount = MNT_DETACH), we will
          #      remove the mount from filesystem tree and /proc/mounts, but the
          #      clients will be left alive and using the old filesystem which is
          #      left in a bad ENOTCONN state. From this point of view restart of
          #      the clients is more preferred compared to leaving them running
          #      but actually disconnected from the data.
          #
          # TODO try to teach wcfs clients to detect filesystem in ENOTCONN state
          #      and reconnect automatically instead of being killed. Then we could
          #      use MNT_DETACH.
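      The deadlock-breaking step itself is small; sketched below, with the
      connection ID being the minor number of the mount's device, e.g. 39 for
      "0:39" (helper name illustrative):

          def abort_fuse_connection(dev):
              # dev is the mount's device id, e.g. "0:39"
              conn = dev.split(':')[1]
              with open('/sys/fs/fuse/connections/%s/abort' % conn, 'w') as f:
                  f.write('1')   # any write aborts the connection and unsticks
                                 # everyone waiting on the dead filesystem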
      
      TODO tests.
      
      Levin also notes at nexedi/wendelin.core!36 (comment 233312)
      
          It would probably indeed be nicer, if `wcfs stop` wouldn't need to kill
          clients. But since wcfs.go already needs to send signals to clients (and
          we already need to set capacities), I too don't think it's urgent to
          teach WCFS clients to detect filesystem in ENOTCONN state.
      
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!36
    • wcfs: Server.stop: Dump kernel traceback of wcfs + lsof + ... if wcfs is stuck · cf1f16f6
      Kirill Smelkov authored
      Use Server._stuckdump, which we just added for `wcfs status` in the
      previous patch, to dump useful information about where wcfs is stuck and
      which processes are related to it.
      
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!36
    • wcfs: Fix and enhance `wcfs status` to be reliable · e751b02c
      Kirill Smelkov authored
      Since the beginning of wcfs - since e3f2ee2d (wcfs: Initial
      implementation of basic filesystem) - `wcfs status` was implemented as
      just a join, reporting ok if that worked. That, however, turned out not
      to be robust: if wcfs is deadlocked, accessing any file on the
      filesystem, even a simple file such as .wcfs/zurl, might hang, and so
      the status check could hang as well. We see lots of such hung
      `wcfs status` processes on the current deployment.
      
      Moreover, wcfs might be deadlocked in another way - e.g. on zheadMu -
      and then accessing .wcfs/zurl will work ok, but the system is not in
      good shape while `wcfs status` fails to report that.
      
      -> Rework `wcfs status` completely to try accessing different files on
      the filesystem, and to do so in a cautious way, so that if wcfs is in a
      problematic state `wcfs status` won't get hung and will report the
      details about the wcfs server and also about the filesystem clients:
      which files are kept open, and what the in-kernel tracebacks of the
      server and the clients are in case wcfs is hung.
      
      Please see the comments in the added status function for details.
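      The heart of that cautious access is to never issue a potentially
      hanging syscall directly: run it in a helper thread and give up after a
      timeout. A sketch of the idea (names illustrative, not the actual
      implementation):

          import threading

          def try_access(f, timeout):
              # e.g. try_access(lambda: xos.readfile(mnt.point + '/.wcfs/zurl'), 5)
              box = {}
              def run():
                  try:
                      box['ret'] = f()
                  except Exception as e:
                      box['exc'] = e
              t = threading.Thread(target=run)
              t.daemon = True   # a stuck f() must not block interpreter exit
              t.start()
              t.join(timeout)
              if t.is_alive():
                  raise RuntimeError("timed out (wcfs might be stuck)")
              if 'exc' in box:
                  raise box['exc']
              return box.get('ret')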
      
      An example of "good" status output when everything is ok:
      
          (neo) (z-dev) (g.env) kirr@deca:~/src/neo/src/lab.nexedi.com/nexedi/wendelin.core$ wcfs status  file://D/1.fs
          INFO 1204 17:04:40.154  325432 __init__.py:506] wcfs: status file://D/1.fs ...
          ok - mount entry: /dev/shm/wcfs/fccdb94842958d09c69261970b8037b0e5510fb8  (0:39)
          ok - wcfs server: pid325414 kirr wcfs
          ok - stat mountpoint
          ok - read .wcfs/zurl
          ok - read .wcfs/stats
      
      And an example of "bad" status output, where wcfs was simulated to be
      in a deadlocked state by trying to read from .wcfs/debug/zhead
      instead of .wcfs/stats:
      
          root@deca:/home/kirr/src/neo/src/lab.nexedi.com/nexedi/wendelin.core# wcfs status file://D/1.fs
          INFO 1204 17:21:04.145  325658 __init__.py:506] wcfs: status file://D/1.fs ...
          ok - mount entry: /dev/shm/wcfs/fccdb94842958d09c69261970b8037b0e5510fb8  (0:39)
          ok - wcfs server: pid325414 kirr wcfs
          ok - stat mountpoint
          ok - read .wcfs/zurl
          fail - read .wcfs/stats: timed out (wcfs might be stuck)
      
          wcfs ktraceback:
          pid325414 kirr wcfs
          tid325414 kirr wcfs
          [<0>] fuse_dev_do_read+0xa29/0xa50 [fuse]
          [<0>] fuse_dev_read+0x79/0xb0 [fuse]
          [<0>] vfs_read+0x239/0x310
          [<0>] ksys_read+0x6b/0xf0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325418 kirr wcfs
          [<0>] hrtimer_nanosleep+0xc7/0x1b0
          [<0>] __x64_sys_nanosleep+0xbe/0xf0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325419 kirr wcfs
          [<0>] fuse_dev_do_read+0xa29/0xa50 [fuse]
          [<0>] fuse_dev_read+0x79/0xb0 [fuse]
          [<0>] vfs_read+0x239/0x310
          [<0>] ksys_read+0x6b/0xf0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325420 kirr wcfs
          [<0>] futex_wait_queue+0x60/0x90
          [<0>] futex_wait+0x185/0x270
          [<0>] do_futex+0x106/0x1b0
          [<0>] __x64_sys_futex+0x8e/0x1d0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325421 kirr wcfs
          [<0>] do_epoll_wait+0x698/0x7d0
          [<0>] do_compat_epoll_pwait.part.0+0xb/0x70
          [<0>] __x64_sys_epoll_pwait+0x91/0x140
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325422 kirr wcfs
          [<0>] futex_wait_queue+0x60/0x90
          [<0>] futex_wait+0x185/0x270
          [<0>] do_futex+0x106/0x1b0
          [<0>] __x64_sys_futex+0x8e/0x1d0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325423 kirr wcfs
          [<0>] futex_wait_queue+0x60/0x90
          [<0>] futex_wait+0x185/0x270
          [<0>] do_futex+0x106/0x1b0
          [<0>] __x64_sys_futex+0x8e/0x1d0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325426 kirr wcfs
          [<0>] fuse_dev_do_read+0xa29/0xa50 [fuse]
          [<0>] fuse_dev_read+0x79/0xb0 [fuse]
          [<0>] vfs_read+0x239/0x310
          [<0>] ksys_read+0x6b/0xf0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325427 kirr wcfs
          [<0>] futex_wait_queue+0x60/0x90
          [<0>] futex_wait+0x185/0x270
          [<0>] do_futex+0x106/0x1b0
          [<0>] __x64_sys_futex+0x8e/0x1d0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325428 kirr wcfs
          [<0>] fuse_dev_do_read+0xa29/0xa50 [fuse]
          [<0>] fuse_dev_read+0x79/0xb0 [fuse]
          [<0>] vfs_read+0x239/0x310
          [<0>] ksys_read+0x6b/0xf0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325429 kirr wcfs
          [<0>] futex_wait_queue+0x60/0x90
          [<0>] futex_wait+0x185/0x270
          [<0>] do_futex+0x106/0x1b0
          [<0>] __x64_sys_futex+0x8e/0x1d0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          tid325646 kirr wcfs
          [<0>] fuse_dev_do_read+0xa29/0xa50 [fuse]
          [<0>] fuse_dev_read+0x79/0xb0 [fuse]
          [<0>] vfs_read+0x239/0x310
          [<0>] ksys_read+0x6b/0xf0
          [<0>] do_syscall_64+0x58/0xc0
          [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          wcfs clients:
            pid325430 kirr bash ('bash',)
                  cwd     -> /dev/shm/wcfs/fccdb94842958d09c69261970b8037b0e5510fb8
      
                  pid325430 kirr bash
                  tid325430 kirr bash
                  [<0>] do_select+0x661/0x830
                  [<0>] core_sys_select+0x1ba/0x3a0
                  [<0>] do_pselect.constprop.0+0xe9/0x180
                  [<0>] __x64_sys_pselect6+0x53/0x80
                  [<0>] do_syscall_64+0x58/0xc0
                  [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
            pid325637 kirr ipython3 ('/usr/bin/python3', '/usr/bin/ipython3')
                  fd/12   -> /dev/shm/wcfs/fccdb94842958d09c69261970b8037b0e5510fb8/.wcfs/zurl
      
                  pid325637 kirr ipython3
                  tid325637 kirr ipython3
                  [<0>] do_epoll_wait+0x698/0x7d0
                  [<0>] __x64_sys_epoll_wait+0x6f/0x110
                  [<0>] do_syscall_64+0x58/0xc0
                  [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
                  tid325638 kirr ipython3
                  [<0>] futex_wait_queue+0x60/0x90
                  [<0>] futex_wait+0x185/0x270
                  [<0>] do_futex+0x106/0x1b0
                  [<0>] __x64_sys_futex+0x8e/0x1d0
                  [<0>] do_syscall_64+0x58/0xc0
                  [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
                  tid325640 kirr ipython3
                  [<0>] futex_wait_queue+0x60/0x90
                  [<0>] futex_wait+0x185/0x270
                  [<0>] do_futex+0x106/0x1b0
                  [<0>] __x64_sys_futex+0x8e/0x1d0
                  [<0>] do_syscall_64+0x58/0xc0
                  [<0>] entry_SYSCALL_64_after_hwframe+0x64/0xce
      
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/venv/z-dev/bin/wcfs", line 11, in <module>
              load_entry_point('wendelin.core', 'console_scripts', 'wcfs')()
            File "<decorator-gen-42>", line 2, in main
            File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 165, in _goframe
              return f(*argv, **kw)
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/__init__.py", line 997, in main
              status(zurl)
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/__init__.py", line 592, in status
              verify("read .wcfs/stats", xos.readfile, "%s/.wcfs/debug/zhead" % mnt.point)
            File "<decorator-gen-43>", line 2, in verify
            File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 165, in _goframe
              return f(*argv, **kw)
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/__init__.py", line 570, in verify
              fail("%s: timed out (wcfs might be stuck)" % subj)
            File "/home/kirr/src/wendelin/wendelin.core/wcfs/__init__.py", line 512, in fail
              raise RuntimeError('(failed)')
          RuntimeError: (failed)
      
      TODO tests.
      
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!36
    • wcfs: Switch to custom lsof · ba1fecf9
      Kirill Smelkov authored
      Previously, on an unmount error, we were invoking lsof(8) to show the
      list of files still opened on the filesystem. But lsof(8) turned out to
      be unreliable because it stats the filesystem, and if e.g. the wcfs
      server process is stopped, lsof only prints
      
          WARNING:wcfs:# lsof /dev/shm/wcfs/1439df02dfcc41ab9dfb68e7ac4ad615f3b7d46e
          WARNING:wcfs:lsof: status error on /dev/shm/wcfs/1439df02dfcc41ab9dfb68e7ac4ad615f3b7d46e: Transport endpoint is not connected
          ...
          WARNING:wcfs:(lsof failed)
      
      fuser(1) from psmisc works a bit better: it can show the list of still
      opened files on the mounted tree even if the filesystem server has
      crashed.
      
      However, with some versions of fuser I still saw "Transport endpoint is
      not connected" once, and in the next patches we will also need to
      inspect the "using" processes more, so if we were to use fuser we would
      need to parse its output, which might get fragile.
      
      -> Do our own lsof utility instead.
      
      We have all the infrastructure in place to do so, in the form of MountDB
      and ProcDB, and as implemented Mount.lsof() emits Proc'esses which can
      be inspected further conveniently. For now we do not do such inspection,
      but for `wcfs status` and `wcfs stop` we will want to poke at the kernel
      tracebacks of those processes.
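      The general shape of such a do-it-yourself lsof, sketched via plain
      /proc scanning (the real implementation goes through MountDB/ProcDB):
      unlike lsof(8) it only reads symlinks under /proc and never touches the
      possibly-dead filesystem itself:

          import os

          def lsof(mntpoint):
              users = {}    # pid -> [(name, target)]
              for pid in filter(str.isdigit, os.listdir('/proc')):
                  entries = ['cwd', 'exe']
                  try:
                      entries += ['fd/' + fd
                                  for fd in os.listdir('/proc/%s/fd' % pid)]
                  except OSError:
                      continue          # process gone, or not ours to inspect
                  for name in entries:
                      try:
                          target = os.readlink('/proc/%s/%s' % (pid, name))
                      except OSError:
                          continue
                      if target == mntpoint or target.startswith(mntpoint + '/'):
                          users.setdefault(int(pid), []).append((name, target))
              # a fuller version would also scan /proc/<pid>/maps for mmapped files
              return users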
      
      /reviewed-by @levin.zimmermann
      /reviewed-on !36
    • wcfs: os: Add ProcDB & co · 7932bac5
      Kirill Smelkov authored
      Add ProcDB, which represents a database of processes, with code to query
      it in several ways. We will need this functionality for `wcfs status`,
      `wcfs stop` and probably for more.
      TODO tests for ProcDB & co.
      
      /reviewed-by @levin.zimmermann
      /reviewed-on !36
    • wcfs: Switch to work with mount entries instead of mountpoints · eef14a82
      Kirill Smelkov authored
      Because mount entries provide more information than just a single
      mountpoint string. For example, later, for `wcfs status` and
      `wcfs stop`, we will need to use the ID of the "device" that is attached
      to the mount, and also the type of the filesystem that is serving the
      mount.
      
      -> Introduce internal.os.MountDB to retrieve information from the OS
         registry of mounted filesystems, and use its entries instead of a
         plain mountpoint string.
      
      wcfs_test.py already had some rudimentary code to parse /proc/mounts,
      which we also replace with querying MountDB.
      
      The API of MountDB might be viewed as a bit of overkill, but it will
      align with the API of the upcoming ProcDB, for which it will be
      reasonable.
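      For reference, here is a sketch of what such a mount entry carries,
      parsed from /proc/self/mountinfo (format per proc(5); the field
      selection is illustrative, not the actual MountDB code):

          def mount_entries():
              with open('/proc/self/mountinfo') as f:
                  for line in f:
                      head, _, tail = line.partition(' - ')
                      f1, f2 = head.split(), tail.split()
                      yield {'dev':        f1[2],    # e.g. "0:39" - backing device id
                             'mountpoint': f1[4],
                             'fstype':     f2[0]}    # e.g. "fuse" for wcfs mounts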
      
      TODO tests for MountDB & co.
      
      /reviewed-by @levin.zimmermann
      /reviewed-on nexedi/wendelin.core!36