Commits · 4da6a16ffb2d3db94ff421d76bdd3065def23fdf · Kirill Smelkov / wendelin.core

04 Dec, 2024 4 commits

Y wcfs: Switch to custom lsof · 4da6a16f

Kirill Smelkov authored Dec 04, 2024

Previously when error on umount we were invoking lsof(8) to show list of
files that are still opened on the filesystem. But lsof(8) turned out to
be unreliable because it stats the filesystem and if e.g. wcfs server
process is stopped lsof only prints

    WARNING:wcfs:# lsof /dev/shm/wcfs/1439df02dfcc41ab9dfb68e7ac4ad615f3b7d46e
    WARNING:wcfs:lsof: status error on /dev/shm/wcfs/1439df02dfcc41ab9dfb68e7ac4ad615f3b7d46e: Transport endpoint is not connected
    ...
    WARNING:wcfs:(lsof failed)

fuser(1) from psmisc works a bit better: it can show list of still
opened files on the mounted tree even if filesystem server is crashed.

However with some version of fuser I still saw "Transport endpoint is
not connected" once, and in the next patches we will also need to
inspect "using" processes more, so if we are to use fuser we will need
to parse its output which might get fragile.

-> Do our own lsof utility instead.

We have all the infrastructure in place to do so in the form of MountDB
and ProcDB, and as implemented Mount.lsof() emits Proc'esses which can
be inspected further conveniently. For now we do not do such inspection,
but for `wcfs status` and `wcfs stop` we will want to poke with kernel
tracebacks of those processes.

4da6a16f

Y wcfs: os: Add ProcDB & co · a50e2ce9

Kirill Smelkov authored Dec 04, 2024

Add ProcDB that represents database of processes with code to query it
in several ways. We will need this functionality for `wcfs status`,
`wcfs stop` and probably for more.

a50e2ce9

Y wcfs: Switch to work with mount entries instead of mountpoints · 1335412f

Kirill Smelkov authored Dec 04, 2024

Because mount entries provide more information compared to just one
string mountpoint. For example later for `wcfs status` and `wcfs stop`
we will need to use the ID of a "device" that is attached to the mount,
and also the type of the filesystem that is serving the mount.

-> Introduce internal.os.MountDB to retrieve information from OS
   registry of mounted filesystems and use its entries instead of plain
   mountpoint string.

wcfs_test.py already had some rudimentary code to parse /proc/mounts
which we also replace with querying MountDB.

The API of MountDB might be viewed as a bit of overkill but it will
align with API of upcoming ProcDB for which it will be reasonable.

1335412f

Y: wcfs: Factor wcfs_test.{readfile,writefile} into wcfs.internal.os · f02bad18
Kirill Smelkov authored Dec 04, 2024
```
We will soon need to use this functions from several places.
Plain code movement.
```
f02bad18

29 Nov, 2024 2 commits

Y wcfs: Make .wcfs/zhead to be debug-only feature and move it to .wcfs/debug/zhead · ae2d0f53

Kirill Smelkov authored Nov 29, 2024

Currently it is easy to deadlock wcfs by opening .wcfs/zhead and not
reading from opened zhead handle due to the nature of what this zhead
file is designed to do. For this reason we want .wcfs/zhead not to be present
on production deployments and be available only where it is needed -
during test runs.

-> Introduce -debug flag for that to "enable debugging features" and put
   .wcfs/zhead to be that debug-only feature, and move it to be
   .wcfs/debug/zhead for clarity.

ae2d0f53

Y wcfs: Rename -d to -trace.fuse · d4dd89c9

Kirill Smelkov authored Nov 29, 2024

Three reasons:

1) -d is not very descriptive, it's help "debug" is also too vague;
2) currently it only activates trace of FUSE exchange
3) in the next patch we are going to introduce -debug that will activate
   debugging-only features which will remain disabled on production runs
   for safety.

-> Rename -d to -trace.fuse to clarify its meaning and prepare for
   upcoming -debug.

d4dd89c9

26 Nov, 2024 5 commits

XY wcfs: multiprocessing: Teach join to return what invoked function returned · 0c412a5d

Kirill Smelkov authored Nov 26, 2024

For upcoming `wcfs status` we will need to be able to invoke functions
from stdlib and similar and retrieve their returned values.

XXX not sure the API is right. Maybe join should just join and return
value retrieved via .retvalue() method.

0c412a5d

Y wcfs: multiprocessing: Allow SubProcess to be used with plain functions directly · f4cf5aff

Kirill Smelkov authored Nov 26, 2024

Sometimes it is useful to run just f(*argv, **kw) instead of f(cin,
cout, *argv, **kw). For example for upcoming `wcfs status` we will need
to run os.stat(".wcfs/zurl") and similar things. Add low-level
flag-parameter to omit passing cin and cout to the called function.

f4cf5aff

Y wcfs: Factor wcfs_faultyprot_test.tSubProcess into wcfs.internal.SubProcess · cd2ee503
Kirill Smelkov authored Nov 26, 2024
```
We will soon need to use this utility not only in tests.
Practically plain code movement with minimal adaptation.
```
cd2ee503

Y wcfs: tests: Teach tSubProcess to run function from any module · 989c7105

Kirill Smelkov authored Nov 26, 2024

We will soon need to use tSubProcess not only for tests and so will move
it out to another place. It will need to be able to run functions from
both wcfs/wcfs_faultyprot_test.py and from other modules. Prepare to
that.

989c7105

Y wcfs: tests: Fix double close of pipes in tSubProcess · 5571e203

Kirill Smelkov authored Nov 26, 2024

Running tests in wcfs/wcfs_faultyprot_test.py was yielding errors about
invalid closes, e.g.

    wcfs/wcfs_faultyprot_test.py::test_wcfs_pinhfaulty_kill_on_watch[_bad_watch_no_pin_read]
    ------------------------------------ live log call -------------------------------------
    M: commit -> @at0 (03fcee2b22c56155)
    INFO     wcfs:__init__.py:293 starting for file:///tmp/testdb_fs.v5XBEf/1.fs ...
    ...
    INFO     wcfs:__init__.py:400 unmount/stop wcfs pid875517 @ /dev/shm/wcfs/1060296857b6e9e0134ffb708787deb4039140ae
    I1126 19:11:13.746387  875517 wcfs.go:3100] stop "/dev/shm/wcfs/1060296857b6e9e0134ffb708787deb4039140ae" "file:///tmp/testdb_fs.v5XBEf/1.fs"
    close failed in file object destructor:             <-- NOTE
    IOError: [Errno 9] Bad file descriptor              <-- NOTE
    PASSED

Since soon we are going to use tSubProcess not only in tests this will
become less innocent with more incentive to fix.

Looking at the strace output I saw that the same file descriptor was
being closed twice in a row. Looking at strace further revealed that
this fd was created via pip syscall and finally I could connect it to
the pipe created by subprocess.Popen.

So what happens here is that we were taking fd number from
subprocess.Popen and passing it to multiprocessing.Connection with both
thinking they own the fd, and closing it on GC.

-> Fix that by really passing the fd to MPConnection as a copy first, and
   then redirecting Popen fds to devnull - to make sure in robust way
   that stdin/out IO does not interfere with pickle exchange going on in
   between child and parent via multiprocessing connections.

Fixes 33ea7769 (wcfs: tests: Move client to be pinkill'ed into separate process)

5571e203

18 Nov, 2024 1 commit

fixup! X wcfs: py: log with data and time present · 7cc71a91

Kirill Smelkov authored Nov 18, 2024

We can include at least milliseconds portion of the time via %(msecs) .

Noticed in Levin's tactivity.py .

7cc71a91

16 Nov, 2024 5 commits
- X (adjust tests after cd6d0b71 ) · e17110b2
  Kirill Smelkov authored Nov 17, 2024
  
  e17110b2
- X wcfs: Switch sticky=true everywhere · 3a0ecd39
  Kirill Smelkov authored Nov 17, 2024
```
ref:Zzj45iDT1E4PakvG@deca.navytux.spb.ru
```
  3a0ecd39
- X wcfs: py: also try fuser in addition to lsof · 5abd20f2
  Kirill Smelkov authored Nov 17, 2024
  
  5abd20f2
- X wcfs: py: log with data and time present · 830cd5f2
  Kirill Smelkov authored Nov 17, 2024
```
So that we can correlage py log with go log.

Tried to mimic glog format output, but it is not exactly the same currently
```
  830cd5f2
- X wcfs: Do not fail if to-be-killed process exited by itself and was waited by its parent · 135a548f
  Kirill Smelkov authored Nov 16, 2024
  
  135a548f
01 Nov, 2024 1 commit
- X disable OS cache retrieve for now as well; Add tracing of substeps in handleδZ · cd6d0b71
  Kirill Smelkov authored Nov 01, 2024
  
  cd6d0b71
29 Oct, 2024 1 commit
- wcfs: applying Kirills suggestions for improved debug log · f2ddf36f
  Levin Zimmermann authored Oct 29, 2024
  
  f2ddf36f
27 Oct, 2024 2 commits
- . · af3c514e
  Levin Zimmermann authored Oct 27, 2024
  
  af3c514e
- . · b15ad994
  Levin Zimmermann authored Oct 27, 2024
  
  b15ad994
25 Oct, 2024 1 commit
- . · 9157390d
  Klaus Wölfel authored Oct 25, 2024
  
  9157390d
07 Oct, 2024 1 commit

wcfs/fuse: Allow other users to access file system · d3ae0c6c

Levin Zimmermann authored Oct 07, 2024

On Theia all Slappartitions have the same user, however on bare-metal
SlapOS each partition has a different user. In this case zope nodes are
not allowed to read WCFS, because they belong to a different user (even
when every user read permission). This is a feature of FUSE and can be
disabled with 'allow_other'. This is unconditionally done with this
patch to allow deployment of WCFS on bare-metal SlapOS.

d3ae0c6c

19 Sep, 2024 2 commits

wendelin.core v2.0.alpha4 · db6fea3d
Levin Zimmermann authored Sep 19, 2024
```
/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!32
```
db6fea3d

wcfs: v↑ NEO/go dependency · ed6a71c1

Levin Zimmermann authored Sep 19, 2024

This patch updates NEO/go to include a patch that fixes
non-deterministic crashs for 'wendelin.core'. Its tradeoff is however a
moderate memory leak. This leak is going to be fixed later when a new
golang version is released. See more details about this here:
neo@ee23551d

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!32

ed6a71c1

18 Sep, 2024 1 commit

wcfs: Clarify error context when WatchLink.sendReq is waiting for reply · 39d53cbb

Kirill Smelkov authored Sep 18, 2024

sendReq has two phases: a) send request, and b) read reply. When there
is an error on the first phase, e.g. client does not read what wcfs is
trying to send, it returns an error like

    pin #2 @03fb63abd6d65b33: sendReq: send .2: context deadline exceeded

however when there is an error on the second phase, e.g. client does not
reply to wcfs request, it currently returns an error like

    pin #2 @03fb63abd6d65b33: sendReq: context deadline exceeded

which is not clear to interpret about which part was problematic.

After this patch the error for the second case becomes

    pin #2 @03fb63abd6d65b33: sendReq: waiting for reply: context deadline exceeded

which is easier to interpret.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!31

39d53cbb

17 Sep, 2024 14 commits

wcfs: v↑ go dependencies · 764d4da8

Levin Zimmermann authored Sep 17, 2024

This patch updates:

- github.com/golang/glog: we already wanted to do so in
    nexedi/wendelin.core!23,
    but we deferred it to keep go 1.18 support. However in recent patches
    we already dropped go 1.18 support and we can therefore update glog now.
- lab.nexedi.com/kirr/neo/go: add fix in handshake, see here for more information:
    kirr/neo@d75f4ac2 and
    kirr/neo@03db1d8a

This patch doesn't update:

- github.com/hanwen/go-fuse: This was updated upstream and Kirill already
    reviewed and integrated patches in custom branch. However when updating
    go-fuse to v2.4.3-0.20240904154523-9546fc238dc6 (this is
    kirr/go-fuse@9546fc23),
    WCFS tests fail on my machine [1] => let's defer update
- github.com/kisielk/og-rek: there are new patches that will be needed
    in the future, but we didn't update NEO/go og-rek dependency yet,
    so let's defer the update in wendelin.core until we updated og-rek
    in NEO/go
- github.com/johncgriffin/overflow: no update on upstream
- github.com/pkg/errors: no update on upstream
- github.com/stretchr/testify: This was already updated with
    nexedi/wendelin.core@c559ec1a
    'testify' seems to have a major release in the future which may break
    some of our test code, but for now major version 1 is still the
    stable release.

----
kirr: I confirm that
kirr/go-fuse@9546fc23 brings in
regression to WCFS tests. It seems I missed some error in that go-fuse
update and it will need to be bisected and debugged.

---

[1] Test failure log:

========================================== FAILURES ==========================================
______________________________________ test_wcfs_basic _______________________________________

    @func
    def test_wcfs_basic():
        t = tDB(); zf = t.zfile
        defer(t.close)

        # >>> lookup non-BigFile -> must be rejected
        with raises(OSError) as exc:
            t.wc._stat("head/bigfile/%s" % h(t.nonzfile._p_oid))
        assert exc.value.errno == EINVAL

        # >>> file initially empty
        f = t.open(zf)
        f.assertCache([])
        f.assertData ([], mtime=t.at0)

        # >>> (@at1) commit data -> we can see it on wcfs
        at1 = t.commit(zf, {2:'c1'})

        f.assertCache([0,0,0])  # initially not cached
        f.assertData (['','','c1'], mtime=t.head)

        # >>> (@at2) commit again -> we can see both latest and snapshotted states
        # NOTE blocks e(4) and f(5) will be accessed only in the end
        at2 = t.commit(zf, {2:'c2', 3:'d2', 5:'f2'})

        # f @head
>       f.assertCache([1,1,0,0,0,0])

wcfs/wcfs_test.py:1341:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

t = <wcfs.wcfs_test.tFile instance at 0x7ff61457b960>, incorev = [1, 1, 0, 0, 0, 0]

    def assertCache(t, incorev):
>       assert t.cached() == incorev
E       assert [0, 0, 0, 0, 0, 0] == [1, 1, 0, 0, 0, 0]
E         At index 0 diff: 0 != 1
E         Use -v to get the full diff

wcfs/wcfs_test.py:791: AssertionError
------------------------------------ Captured stdout call ------------------------------------

M: commit -> @at0 (03fb5dfbe3c1cd55)

M: commit -> @at1 (03fb5dfbe4936a66)
M:      f<0000000000000002>     [2]

M: commit -> @at2 (03fb5dfbe4d01166)
M:      f<0000000000000002>     [2, 3, 5]
>>> Change history by file:

f<0000000000000002>:
                                0 1 2 3 4 5 6 7
                                a b c d e f g h
        @at0 (03fb5dfbe3c1cd55)
        @at1 (03fb5dfbe4936a66)     2
        @at2 (03fb5dfbe4d01166)     2 3   5

------------------------------------ Captured stderr call ------------------------------------
I0917 12:43:53.392222  124283 wcfs.go:2752] start "/dev/shm/wcfs/0ca22ca24e4cff2d01c10aa546fe5d5ac64bce72" "file:///tmp/testdb_fs.z5ZoMH/1.fs"
I0917 12:43:53.392282  124283 wcfs.go:2758] (built with go1.21.13)
W0917 12:43:53.392404  124283 storage.go:232] zodb: FIXME: open file:///tmp/testdb_fs.z5ZoMH/1.fs: raw cache is not ready for invalidations -> NoCache forced
W0917 12:43:53.567807  124283 wcfs.go:2331] /head/bigfile: lookup "0000000000000001": bigfopen 0000000000000001 @03fb5dfbe3c1cd55: invalid argument: ZODB.Broken("persistent.Persistent") is not a ZBigFile
I0917 12:43:53.710208  124283 wcfs.go:2933] stop "/dev/shm/wcfs/0ca22ca24e4cff2d01c10aa546fe5d5ac64bce72" "file:///tmp/testdb_fs.z5ZoMH/1.fs"
------------------------------------- Captured log call --------------------------------------
WARNING  ZODB.FileStorage:FileStorage.py:412 Ignoring index for /tmp/testdb_fs.z5ZoMH/1.fs
_________________________________ test_wcfs_watch_vs_access __________________________________

    @func
    def test_wcfs_watch_vs_access():
        t = tDB(); zf = t.zfile; at0=t.at0
        defer(t.close)

        f = t.open(zf)
        at1 = t.commit(zf, {2:'c1'})
        at2 = t.commit(zf, {2:'c2', 3:'d2', 5:'f2'})
        at3 = t.commit(zf, {0:'a3', 2:'c3', 5:'f3'})

        f.assertData(['a3','','c3','d2','x','x'])
        f.assertCache([1,1,1,1,0,0])

        # watched + commit -> read -> receive pin messages.
        # read vs pin ordering is checked by assertBlk.
        #
        # f(5) is kept not accessed to check later how wcfs.go handles δFtail
        # rebuild after it sees not yet accessed ZBlk that has change history.
        wl3  = t.openwatch();  w3 = wl3.watch(zf, at3);  assert at3 == t.head
        assert w3.at     == at3
        assert w3.pinned == {}

        wl3_ = t.openwatch();  w3_ = wl3_.watch(zf, at3)
        assert w3_.at     == at3
        assert w3_.pinned == {}

        wl2  = t.openwatch();  w2 = wl2.watch(zf, at2)
        assert w2.at     == at2
        assert w2.pinned == {0:at0, 2:at2}

        # w_assertPin asserts on state of .pinned for {w3,w3_,w2}
        def w_assertPin(pinw3, pinw3_, pinw2):
            assert w3.pinned   == pinw3
            assert w3_.pinned  == pinw3_
            assert w2.pinned   == pinw2

        f.assertCache([1,1,1,1,0,0])
        at4 = t.commit(zf, {1:'b4', 2:'c4', 5:'f4', 6:'g4'})
>       f.assertCache([1,0,0,1,0,0,0])

wcfs/wcfs_test.py:1702:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

t = <wcfs.wcfs_test.tFile instance at 0x7ff614512050>, incorev = [1, 0, 0, 1, 0, 0, ...]

    def assertCache(t, incorev):
>       assert t.cached() == incorev
E       assert [0, 0, 0, 0, 0, 0, ...] == [1, 0, 0, 1, 0, 0, ...]
E         At index 0 diff: 0 != 1
E         Use -v to get the full diff

wcfs/wcfs_test.py:791: AssertionError
------------------------------------ Captured stdout call ------------------------------------

M: commit -> @at0 (03fb5dfc0fd82300)

M: commit -> @at1 (03fb5dfc10b92ecc)
M:      f<0000000000000049>     [2]

M: commit -> @at2 (03fb5dfc10cee9dd)
M:      f<0000000000000049>     [2, 3, 5]

M: commit -> @at3 (03fb5dfc1100c999)
M:      f<0000000000000049>     [0, 2, 5]

C: setup watch f<0000000000000049> @at3 (03fb5dfc1100c999)

C: setup watch f<0000000000000049> @at3 (03fb5dfc1100c999)

C: setup watch f<0000000000000049> @at2 (03fb5dfc10cee9dd)

M: commit -> @at4 (03fb5dfc120ed611)
M:      f<0000000000000049>     [1, 2, 5, 6]
>>> Change history by file:

f<0000000000000049>:
                                0 1 2 3 4 5 6 7
                                a b c d e f g h
        @at0 (03fb5dfc0fd82300)
        @at1 (03fb5dfc10b92ecc)     2
        @at2 (03fb5dfc10cee9dd)     2 3   5
        @at3 (03fb5dfc1100c999) 0   2     5
        @at4 (03fb5dfc120ed611)   1 2     5 6

------------------------------------ Captured stderr call ------------------------------------
I0917 12:44:03.733037  125217 wcfs.go:2752] start "/dev/shm/wcfs/0ca22ca24e4cff2d01c10aa546fe5d5ac64bce72" "file:///tmp/testdb_fs.z5ZoMH/1.fs"
I0917 12:44:03.733126  125217 wcfs.go:2758] (built with go1.21.13)
W0917 12:44:03.733418  125217 storage.go:232] zodb: FIXME: open file:///tmp/testdb_fs.z5ZoMH/1.fs: raw cache is not ready for invalidations -> NoCache forced
I0917 12:44:04.475273  125217 wcfs.go:2933] stop "/dev/shm/wcfs/0ca22ca24e4cff2d01c10aa546fe5d5ac64bce72" "file:///tmp/testdb_fs.z5ZoMH/1.fs"
============================ 2 failed, 42 passed in 55.81 seconds ============================
I0917 12:44:17.882140  125540 wcfs.go:2933] stop "/dev/shm/wcfs/c4d833a0bdea4c51decf5425b8ad2cc4d017280f" "file:///tmp/testdb_fs.bvHBy9/1.fs"
make: *** [Makefile:174: test.wcfs] Error 1

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!30

764d4da8

wcfs: Implement protection against faulty client + related fixes and improvements · 89d653c0

Kirill Smelkov authored Sep 17, 2024

The WCFS documentation specifies [1]:

- - - 8> - - - 8> - - -

If a client, on purpose or due to a bug or being stopped, is slow to respond
with ack to file invalidation notification, it creates a problem because the
server will become blocked waiting for pin acknowledgments, and thus all
other clients, that try to work with the same file, will get stuck.

[...]

Lacking OS primitives to change address space of another process and not
being able to work it around with ptrace in userspace, wcfs takes approach
to kill a slow client on 30 seconds timeout by default.

- - - <8 - - - <8 - - -

But before, this protection wasn't implemented yet: one
faulty client could therefore freeze the whole system. With this work
this protection is implemented now: faulty clients are killed after the
timeout or any other misbehaviour in their pin handlers.

Working on this topic also resulted in several fixes and improvements
around isolation protocol implementation on the server side.

See individual patches for details.

[1] https://lab.nexedi.com/nexedi/wendelin.core/blob/38dde766/wcfs/wcfs.go#L186-208Co-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com>

/reviewed-on nexedi/wendelin.core!18

89d653c0

wcfs: Require Go 1.19 + go mod tidy · 1fcef9c9

Levin Zimmermann authored Sep 16, 2024

I would only suggest one very tiny change. In go.mod we have:

    module lab.nexedi.com/nexedi/wendelin.core/wcfs

    go 1.14

I think this needs to be updated to go 1.19 due to atomic.Int64.

And maybe we just need general go mod tidy update.

/reviewed-by @kirr
/reviewed-on nexedi/wendelin.core!18

1fcef9c9

wcfs: Implement protection against faulty client · c559ec1a

Kirill Smelkov authored Sep 16, 2024

The WCFS documentation specifies [1]:

- - - 8> - - - 8> - - -

If a client, on purpose or due to a bug or being stopped, is slow to respond
with ack to file invalidation notification, it creates a problem because the
server will become blocked waiting for pin acknowledgments, and thus all
other clients, that try to work with the same file, will get stuck.

[...]

Lacking OS primitives to change address space of another process and not
being able to work it around with ptrace in userspace, wcfs takes approach
to kill a slow client on 30 seconds timeout by default.

- - - <8 - - - <8 - - -

But before this patch, this protection wasn't implemented yet: one
faulty client could therefore freeze the whole system. With this patch
this protection is implemented now: faulty clients are killed after the
timeout or any other misbehaviour in their pin handlers.

[1] https://lab.nexedi.com/nexedi/wendelin.core/blob/38dde766/wcfs/wcfs.go#L186-208

Preliminary history:

    levin.zimmermann/wendelin.core@24904e82
    levin.zimmermann/wendelin.core@b02dcadcCo-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com>

/discussed-on nexedi/wendelin.core!18

c559ec1a

wcfs: Shutdown WatchLink on any pin error · 007d53db

Kirill Smelkov authored Sep 16, 2024

If a pin misbehaves or there is IO error or anything else, we want to
stop all communication on the watchlink, cancel on in-flight pin
handlers, and (TODO) kill the client with SIGBUS.

This patch organizes WatchLink shutdown on any pin error.
This functionality is indirectly tested by test_Wcfs_watch_robust and
will be also indirectly tested by faultyprotection tests.

It would be good to have dedicated tests probably, but that is,
hopefully, TODO.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18

007d53db

wcfs: Do not cancel pin handler by a READ interrupt · 7a79bdd6

Kirill Smelkov authored Sep 16, 2024

Pinning is critical operation whose failure will soon lead to client
being killed with SIGBUS. WCFS correctness also depend fundamentally on
pin operation, if started, to be handled by the client.

-> rework the READ handler not to cancel pin if a READ interrupt comes
   in from the OS client.

Do this via organizing WatchLink.serveCtx and running pins under this
context instead of under READ context. Later we will adjust pins to also
cancel this context on any error.

Test is, hopefully, TODO.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18

7a79bdd6

wcfs: Fix potential stuck in WatchLink.serve exit codepath · a6dd7806

Kirill Smelkov authored Sep 16, 2024

When serve is completing and going to exit, it sends an error message to
the client without any timeout. So if the client is not reading from the
channel, wcfs will get stuck waiting for the message to be consumed.

-> Fix that by trying to send that last error only during 1 second and
   ignoring errors if any

Test is, hopefully, TODO.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18

a6dd7806

wcfs: Rework WatchLink.serve exit codepath for better clarity · 08b011f5

Kirill Smelkov authored Sep 16, 2024

Bring in more structure:

- final watchlink cleanup is done in its own block
- cancelling spawned handlers is done in another block
- add more comments explaining things

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18

08b011f5

wcfs: Rework WatchLink.serve to rely on context cancellation to stop reading · c7c3b82a

Kirill Smelkov authored Sep 15, 2024

Previously we were using .sk.CloseRead() to interrupt sk.Read(), but
that is not necessary since .sk, relying on xio.Pipe, implements
xio.Reader natively with full support for cancellation.

The original code to cancel via CloseRead comes from mid 2019 and predates

go123@7ad867a3
go123@0e368363
go123@0bdac628
go123@9db4dfac
go123@d2dc6c09

And in b17aeb8c and
6f0cdaff (wcfs: Provide isolation to clients), it seems, I missed to
update WatchLink.serve code to that.

Do that now because it simplifies code flow organization a bit.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18

c7c3b82a

wcfs: tests: Extend faulty protection tests with more kinds of faulty clients · c91fb14e

Kirill Smelkov authored Sep 16, 2024

So far we were testing only against faulty client that reads pin
notification ok, but does not reply to the notification. But there could
be more problems:

1) a client does not read pin notification at all
2) a client closes watchlink abruptly after reading pin notification
3) a client replies to pin notification but the reply is not "ack"

The first problem, if not handled leads to whole set of clients to
become stuck on reading the same block as the faulty client. The other
problems also indicate breakage of the isolation protocol from the client
side and that wcfs can no longer be sure that it provides good
uncorrupted data to the client.

In the first case, similarly to "no reply" situation we need to kill the
client to make progress while maintaining safety as well. In the cases 2
and 3 we cannot maintain safety if the faulty client remains in the set
of live and served clients, so it is also logical to send SIGBUS/SIGKILL
to it.

Killing a client with SIGBUS is similar to how OS kernel sends SIGBUS when
a memory-mapped file is accessed and loading file data results in EIO. It is
also similar to wendelin.core 1 where SIGBUS is raised if loading file block
results in an error.

Extend tests to cover all explained scenarios.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18

c91fb14e

wcfs: tests: Add test to exercies faulty client that does not reply to pin... · 0c35ae45

Kirill Smelkov authored Sep 16, 2024

wcfs: tests: Add test to exercies faulty client that does not reply to pin triggered by readPinWatchers

Levin writes:

    This patch extends the test scope of 'test_wcfs_pintimeout_kill'. Before
    this patch, the test only ensured that a client that does not
    respond to pin requests during the initial watch request [1] is
    killed. Now it also tests that a faulty client is killed when a block
    is invalidated. Since there are no other situations where the WCFS
    server sends pin requests to a client, the tests now cover all situations
    where a faulty client might not respond. This patch therefore aims to
    increase the security that WCFS is not blocked by a faulty client.

    [1] See nexedi/wendelin.core!18

Preliminary history:

    9d42efffCo-authored-by: Levin Zimmermann <levin.zimmermann@nexedi.com>

/discussed-on nexedi/wendelin.core!18

0c35ae45

wcfs: tests: Factor assertion that process should be killed into assertKilled · 008211fb

Kirill Smelkov authored Sep 16, 2024

We will need to use this utilitin from several places in the next patch.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18

008211fb

wcfs: tests: Allow to adjust tDB.assertBlk timeout · 82aa5949

Kirill Smelkov authored Sep 16, 2024

Currently assertBlk uses default timeout() to wait for READ operation to
complete. That works well everywhere except that in faulty
protection tests wcfs server will first need to wait for its own
pintimeout time to kill the faulty client and only then	return read
result to all non-faulty clients.

This way corresponding test, when one client fails to handle pin
notification well triggered due to READ operations, will need to use
adjusted longer timeout for the good client when doing assertBlk.

Adjust assertBlk to allow specifying custom timeout as preparatory step
for that.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18

82aa5949

wcfs: tests: Extend faultyprotect test with good client · 001e2e7e

Kirill Smelkov authored Sep 16, 2024

And make sure that that good client can setup its watch ok even
through there simultaneously is a faulty client that should get killed.

/reviewed-by @levin.zimmermann
/reviewed-on nexedi/wendelin.core!18

001e2e7e