• Kirill Smelkov's avatar
    runUnitTest: Close ZODB DB before shutdown · 171a66f2
    Kirill Smelkov authored
    We are already closing Storage on shutdown (see nearby Storage.close call in
    the patch), but the DB handle was not closed. With classic ZODB it does not
    really matter in practice, because not closing DB is only leaking RAM resources
    and the program is anyway terminated soon.
    
    However with wendelin.core 2 things are different: in addition to ZODB storage
    server, there is also synthetic WCFS filesystem from which files are opened
    and memory-mmaped. In runUnitTest we start both ZODB and WCFS servers and we
    also shut down them both in the end. The filesystem server can be cleanly
    unmounted and shutdown only when there are no opened files left on it.
    
    Wendelin.core 2 client works by complementing each ZODB connection (zconn) with
    WCFS-level connection (wconn) to WCFS server. Those two zconn and wconn are
    kept in sync by wendelin.core 2 client logic: whenever zconn adjusts its view
    of the database, so does wconn. And whenever zconn is garbage collected, so
    does wconn is closed to release resources and close corresponding files opened
    on WCFS. In addition to garbage-collection, wconn is also closed when zconn.db
    - the ZODB DB handle via which zconn was created - is closed. This is needed to
    be able to reliably trigger freeing WCFS resources, because even after DB is
    closed, zconn can still stay alive forever being referenced from some python
    object - e.g. a frame or traceback or something else.
    
    The latter scenario actually happens during runUnitTest run. As the result it
    leads to inability to unmount and stop WCFS server cleanly:
    
        $ ./runUnitTest --load --save --with_wendelin_core -v erp5_core_test:testWendelinCore
        ...
        test (erp5.component.test.erp5_version.testWendelinCore.TestWendelinCoreBasic) ... ok
    
        ----------------------------------------------------------------------
        Ran 1 test in 0.105s
    
        OK
        F0618 19:05:46.359140   35468 wcfs/client/wcfs.cpp:486] CRITICAL: pinner: wcfs /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: wlink25: recvReq: link is down
        F0618 19:05:46.359173   35468 wcfs/client/wcfs.cpp:487] CRITICAL: wcfs server will likely kill us soon.
        CRITICAL: pinner: wcfs /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: wlink25: recvReq: link is down
        CRITICAL: wcfs server will likely kill us soon.
        Traceback (most recent call last):
          File ".../bin/runUnitTest", line 312, in <module>
            sys.exit(runUnitTest.main())
          File ".../parts/erp5/Products/ERP5Type/tests/runUnitTest.py", line 926, in main
            run_only=run_only,
          File ".../parts/erp5/Products/ERP5Type/tests/runUnitTest.py", line 709, in runUnitTestList
            wcfs_server.stop()
          ...
          File ".../parts/wendelin.core/wcfs/__init__.py", line 543, in _fuse_unmount
            raise RuntimeError("%s\n(more details logged)" % emsg)
        RuntimeError: fuse_unmount /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: failed: fusermount: failed to unmount /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: Device or resource busy
        (more details logged)
    
        # logs
        2021-06-18 19:05:45.978 INFO root wcfs: unmount/stop wcfs pid32981 @ /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2
        2021-06-18 19:05:46.068 WARNING root fuse_unmount /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: failed: fusermount: failed to unmount /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2: Device or resource busy
        2021-06-18 19:05:46.068 WARNING root # lsof /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2
        2021-06-18 19:05:46.357 WARNING root COMMAND     PID       USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
        runUnitTe 32175 slapuser34   24r   REG   0,48      111    4 /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2/.wcfs/zurl
        runUnitTe 32175 slapuser34   25u   REG   0,48        0    7 /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2/head/watch
        runUnitTe 32175 slapuser34   26r   REG   0,48  2097152    9 /dev/shm/wcfs/b53b61099c740b452b383db6df6dce4ad6d23ba2/head/bigfile/00000000000078b4
    
        2021-06-18 19:05:46.358 WARNING root -> kill -TERM wcfs.go ...
        2021-06-18 19:05:46.358 WARNING root -> abort FUSE connection ...
    
    I've debugged things a bit and even with
    kirr/ZODB@bbd03b3a ZODB connection stays alive
    being referenced from some frame objects.
    
    -> Fix this problem by explicitly closing ZODB DB nearby tests shutdown before call to wcfs_server.stop
    171a66f2
runUnitTest.py 37.7 KB