1. 17 Nov, 2020 2 commits
    • Kirill Smelkov's avatar
      bigfile/py: Garbage-collect BigFile <=> BigFileH cycles · a6a8f5ba
      Kirill Smelkov authored
      Since ZBigFile keeps references to fileh objects that are created
      through it it forms a file <=> fileh cycle that is not collected without
      cyclic GC:
      
      https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.13-52-ga702d41/bigfile/file_zodb.py#L497
      https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.13-52-ga702d41/bigfile/file_zodb.py#L566-571
      
      We did not noticed this leak until now because it is small, but with
      upcoming wendelin.core 2 it is important to release a fileh, because
      there is WCFS connection associated with fileh, and if fileh is not
      released, that connection also stays alive, keeping on-WCFS resources
      still being used, and preventing WCFS from being unmounted cleanly.
      
      -> Add cyclic GC support to PyBigFile / PyBigFileH
      
      NOTE: we still don't allow PyVMA <=> PyBigFileH cycles to be collected,
      because fileh_close called from fileh.__del__ asserts that there are no
      live mappings left. See added comments for details. There is no
      known practical need to use such cycles, so this should be ok.
      
      See also other patches on cyclic GC topic:
      
      - 450ad804 (bigarray: ArrayRef support for BigArray)  // adds cyclic GC support for PyVMA
      - d97641d2 (bigfile/py: Properly untrack PyVMA from GC before dealloc)
      
      /proposed-for-review-on !12
      a6a8f5ba
    • Kirill Smelkov's avatar
      bigfile/py: Move PyVMA's support for cyclic GC close to pyvma_dealloc · 7cc35422
      Kirill Smelkov authored
      The logic in pyvma_traverse and pyvma_clear needs to be synchronized
      with PyVMA deallocation. In the next patche we'll be amending this
      logic, and it will help a reader to keep all those functions together.
      
      For the reference: PyVMA support for cyclic GC was introduced in
      450ad804 (bigarray: ArrayRef support for BigArray). See also d97641d2
      (bigfile/py: Properly untrack PyVMA from GC before dealloc).
      
      /proposed-for-review-on !12
      7cc35422
  2. 17 Apr, 2020 1 commit
  3. 15 Apr, 2020 4 commits
  4. 14 Apr, 2020 4 commits
  5. 01 Apr, 2020 2 commits
  6. 18 Dec, 2019 7 commits
    • Kirill Smelkov's avatar
      bigfile/py: Move data structures to public .h file · 907bd9d4
      Kirill Smelkov authored
      This is needed so that e.g. a Python class implemented in C or Cython
      (cdef class) could inherit from PyBigFile.
      
      Don't put _bigfile.h into separate include/ directory, and keep it along
      main .c file, similarly to how pygolang is organized.
      907bd9d4
    • Kirill Smelkov's avatar
      bigfile/py: Provide package-level documentation · 3684d164
      Kirill Smelkov authored
      Provide package-level documentation that gives brief overview of what
      this package does. Split internal notes into separate comment.
      3684d164
    • Kirill Smelkov's avatar
      bigfile/py: Stop using Plan9 C extensions · bf44905b
      Kirill Smelkov authored
      Starting from 5755a6b3 (Basic setup.py / Makefile to build/install/sdist
      stuff + bigfile.so skeleton) and 35eb95c2 (bigfile: Python wrapper
      around virtual memory subsystem) we were using Plan9 C extensions[1] for
      simple inheritance. Those extensions are supported by GCC with
      -fplan9-extensions option. However that option is supported only for C,
      while for C++ it does not work at all with error produced by the compiler
      on Plan9 syntax.
      
      Soon we'll need to add another extension - written in C++ - to
      wendelin.core . This extension will be providing client side of WCFS and
      integrating that with virtmem. In that extension we'll need to use
      _bigfile data structures - in particular we'll need to use PyBigFile and
      extend it with another `cdef class` children written in Cython/C++.
      
      This patch prepares for that: first stop using Plan9 C extensions in
      _bigfile py module data structures and adapt the code correspondingly.
      In the next patch we'll move those data structures into an .h file.
      
      We don't drop -fplan9-extensions from setup.py, because Plan9-style
      inheritance continues to be used internally by virtmem - e.g. in
      ram_shmfs.c and friends.
      
      A bit pity to drop that good stuff, but given that we'll need to use C++
      for WCFS client for other good stuff provided by pygolang[2], it is a
      reasonable compromise.
      
      [1] http://9p.io/sys/doc/comp.html  "Extensions" section
      [2] https://pypi.org/project/pygolang
      bf44905b
    • Kirill Smelkov's avatar
      bigfile/zodb: Factor-out LivePersistent into -> lib/zodb · c02776e9
      Kirill Smelkov authored
      It was from long-ago marked as "XXX move to common place".
      c02776e9
    • Kirill Smelkov's avatar
      bigfile/zodb: FIXME invalidations are not working correctly on blocks topology change · 8c32c9f6
      Kirill Smelkov authored
      I noticed this while working on WCFS: if file blocks topology change,
      the invalidation process is not working correctly. It is also not
      correct with respect to live cache pressure.
      
      Add FIXME in the code and test for live cache pressure.
      
      kirr/wendelin.core@5a4562fc
      kirr/wendelin.core@48eb692f
      kirr/wendelin.core@d1a579b2
      kirr/wendelin.core@69c94fbc
      8c32c9f6
    • Kirill Smelkov's avatar
      bigfile/zodb: Explain why we always mark ZBlk object changed if block data change · d27ade8e
      Kirill Smelkov authored
      For ZBlk0 this is trivial, but for ZBlk1 it may seem that we could avoid
      changing ZBlk object itself and mark only pointed-to ZData object as
      changed. However that would be not correct to do if we consider
      invalidations.
      
      Noticed while working on WCFS.
      d27ade8e
    • Kirill Smelkov's avatar
      *: Add package-level documentation to ZODB-related packages · d5e0d2f9
      Kirill Smelkov authored
      Add package-level documentation to
      
        - bigfile/file_zodb.py,
        - bigarray/array_zodb.py, and
        - lib/zodb.py
      
      The most interesting read is file_zodb.py .
      
      Slightly improve documenation for functions in a couple of places.
      
      Improving documentation was long overdue and it is improved only slightly by this commit.
      d5e0d2f9
  7. 04 Dec, 2019 1 commit
  8. 15 Jul, 2019 1 commit
    • Kirill Smelkov's avatar
      bigfile/py: It is ok to have .fileh_open() as BigFile method · 7ee02038
      Kirill Smelkov authored
      There was an XXX of whether fileh_open should be a BigFile method or
      global function. However if it would be a global function it will need
      to anyway accept file parameter to indicate which file is opened, and
      that in turn suggests that it should be a file method. Remove XXX.
      7ee02038
  9. 12 Jul, 2019 3 commits
    • Kirill Smelkov's avatar
      */tests: Use defer instead of finally · 2b457640
      Kirill Smelkov authored
      try/finally was used in a couple of places to save/restore default ZBlk
      format setting. Move the restore part close to save with the help of
      defer.
      2b457640
    • Kirill Smelkov's avatar
      *: Use defer for dbclose & friends · 5c8340d2
      Kirill Smelkov authored
      For tests this makes sure that if one test fails, it won't make following
      tests fail just because the next test will fail trying to lock test database.
      
      For regular code (demo_zbigarray.py) this is also a good thing to do -
      to always close the database irregardless of whether an exception was
      raised before program reached end of main.
      
      Pygolang becomes regular - not test only - dependency. Being regular
      dependency is currently required only by demo_zbigarray.py, but it will
      be also used in upcoming wcfs, so adding pygolang into wendelin.core
      dependencies aligns with the plan.
      
      dbclose now uses defer almost everywhere - there are still few places in
      tests, where one test function is opening/closing test database multiple
      times - those were not (yet ?) converted.
      5c8340d2
    • Kirill Smelkov's avatar
      */tests: Use pytest.raises in modern way · b12e319e
      Kirill Smelkov authored
      Instead of
      
      	raises(Exception, 'code')
      
      do
      
      	with raises(Exception):
      		code
      
      This removes lots of warnings, similar to below example:
      
      	bigfile/tests/test_basic.py::test_basic
      	  /home/kirr/src/wendelin/wendelin.core/bigfile/tests/test_basic.py:79: PytestDeprecationWarning: raises(..., 'code(as_a_string)') is deprecated, use the context manager form or use `exec()` directly
      
      	  See https://docs.pytest.org/en/latest/deprecations.html#raises-warns-exec
      	    raises(ROAttributeError, "f.blksize = 1") # RO attribute
      b12e319e
  10. 11 Jul, 2019 2 commits
  11. 10 Jul, 2019 2 commits
  12. 09 Jul, 2019 1 commit
    • Kirill Smelkov's avatar
      bigfile: Unstub ram_close · d91ef8fb
      Kirill Smelkov authored
      It should release resources associated with RAM. Make it call .ram_close
      from RAM ops. Add corresponding .ram_close to ram_shmfs. This fixes
      SHMFS_RAM->prefix leak.
      d91ef8fb
  13. 08 Jul, 2019 3 commits
    • Kirill Smelkov's avatar
      bigfile/test/test_ram: Don't forget to free allocated Page structs · e8eca379
      Kirill Smelkov authored
      test_ram is low-level test that tests RAM pages allocation/mmapping.
      As allocated pages are not integrated with virtmem (not added to any
      file mapping and RAM->lru_list) the Page structs have to be explicitly
      freed. Fixes e.g.
      
      	Direct leak of 80 byte(s) in 1 object(s) allocated from:
      	    #0 0x7ff29af46518 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9518)
      	    #1 0x56131dc22289 in zalloc include/wendelin/utils.h:67
      	    #2 0x56131dc225d6 in ramh_alloc_page bigfile/tests/../ram.c:41
      	    #3 0x56131dc2a19e in main bigfile/tests/test_ram.c:130
      	    #4 0x7ff29ac9f09a in __libc_start_main ../csu/libc-start.c:308
      e8eca379
    • Kirill Smelkov's avatar
      bigfile: RAM must be explicitly free'ed after close · f688a31d
      Kirill Smelkov authored
      Else on-heap allocated RAM object is leaked. Fixes e.g. the following
      error on ASAN:
      
      	Direct leak of 56 byte(s) in 1 object(s) allocated from:
      	    #0 0x7fc9ef390518 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe9518)
      	    #1 0x555ca792f309 in zalloc include/wendelin/utils.h:67
      	    #2 0x555ca7935f9a in ram_limited_new bigfile/tests/../../t/t_utils.c:35
      	    #3 0x555ca793a0ba in test_file_access_synthetic bigfile/tests/test_virtmem.c:292
      	    #4 0x555ca7967bc4 in main bigfile/tests/test_virtmem.c:1121
      	    #5 0x7fc9ef0e909a in __libc_start_main ../csu/libc-start.c:308
      f688a31d
    • Kirill Smelkov's avatar
      bigfile/test: Don't forget to close opened RAM · f1931264
      Kirill Smelkov authored
      Only one place that was using ram_new was missing to call ram_close in
      the end.
      f1931264
  14. 18 Jun, 2019 1 commit
    • Kirill Smelkov's avatar
      Fix build for Python 3.7 · bca5f79e
      Kirill Smelkov authored
      Starting from Python 3.7 the place to keep exception state was changed:
      https://github.com/python/cpython/commit/ae3087c638
      
      NOTE ZEO4 does not wok with Python3.7, because ZEO4 uses "async" for a
      variable and that breaks because starting from Python3.7 "async" became
      a keyword.
      
      After the fix wendelin.core tests pass under all python2.7, python3.6
      and python3.7.
      bca5f79e
  15. 23 May, 2019 1 commit
    • Kirill Smelkov's avatar
      bigfile/zodb: Resync _ZBigFileH to Connection.transaction_manager on every connection reopen · d9d6adec
      Kirill Smelkov authored
      This continues c7c01ce4 (bigfile/zodb: ZODB.Connection can migrate
      between threads on close/open and we have to care): Until now we were
      retrieving zconn.transaction_manager on _ZBigFileH init, and further using
      that transaction manager for every connection reopen. However that is
      not correct because on every reopen connection can be given new
      transaction manager.
      
      We were not practically hitting the bug because until recently ZODB was,
      by default, using the same ThreadTransactionManager manager instance as
      Connection.transaction_manager for all connections, and not doing all
      steps needed to keep _ZBigFileH.transaction_manager in sync to
      Connection was forgiven - a particular transaction manager that was used
      was TransactionManager instance implicitly associated with current
      thread by global threading.Local transaction.manager . However starting
      from ZODB 5.5.0 Connection code was changed to remember as
      .transaction_manager the particular TransactionManager instance without
      any threading.Local games:
      
          https://github.com/zopefoundation/ZODB/commit/b6ac40f153
          https://github.com/zopefoundation/ZODB/issues/208
          https://github.com/zopefoundation/ZODB/pull/226
      
      Given that we were not syncing properly that broke wendelin.core tests, for
      example:
      
          bigfile/tests/test_filezodb.py::test_bigfile_filezodb_vs_conn_migration Exception in thread Thread-1:
          Traceback (most recent call last):
            File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
              self.run()
            File "/usr/lib/python2.7/threading.py", line 754, in run
              self.__target(*self.__args, **self.__kwargs)
            File "/home/kirr/src/wendelin/wendelin.core/bigfile/tests/test_filezodb.py", line 401, in T11
              transaction.commit()    # should be nothing
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/transaction/_manager.py", line 252, in commit
              return self.manager.commit()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/transaction/_manager.py", line 131, in commit
              return self.get().commit()
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/transaction/_transaction.py", line 298, in commit
              self._synchronizers.map(lambda s: s.beforeCompletion(self))
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/transaction/weakset.py", line 61, in map
              f(elt)
            File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/transaction/_transaction.py", line 298, in <lambda>
              self._synchronizers.map(lambda s: s.beforeCompletion(self))
            File "/home/kirr/src/wendelin/wendelin.core/bigfile/file_zodb.py", line 671, in beforeCompletion
              assert txn is zconn.transaction_manager.get()
          AssertionError
      
      What is happening here is that one thread used the connection and
      ZBigFile/_ZBigFileH associated with it, then the connection was closed
      and released to DB pool. Then the connection was reopened but by another
      thread and thus with different TransactionManager instance and oops -
      _ZBigFileH.transaction_manager is different because it is
      TransactionManager instance that was used by the first thread.
      
      Fix it by resyncing _ZBigFileH.transaction_manager on every
      connection reopen. No new test as existing tests already cover the
      problem when run with ZODB >= 5.5.0 .
      d9d6adec
  16. 11 Jan, 2019 1 commit
    • Kirill Smelkov's avatar
      bigfile/py: Properly untrack PyVMA from GC before dealloc · d97641d2
      Kirill Smelkov authored
      On a testing instance we started to see segfaults in pyvma_dealloc()
      with inside calls to vma_unmap but with NULL pyvma->fileh. That was
      strange, becuse before calling vma_unmap(), the code explicitly checks
      whether pyvma->fileh is !NULL.
      
      That was, as it turned out, due to pyvma_dealloc being called twice at the
      same time from two python threads. Here is how that was possible:
      
      T1 decrefs pyvma and finds its reference count drops to zero. It calls
      pyvma_dealloc. From there vma_unmap() is called, which calls virt_lock()
      and that releases GIL first. Another thread T2 was waiting for GIL, it
      acquires it, does some work at python level and somehow triggers GC.
      Since PyVMA supports cyclic GC, it was on GC list and thus GC calls
      dealloc for the same vma again. Here is how it looks in the backtraces:
      
      T1:
      
      	#0  0x00007f6aefc57827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1e011d0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
      	#1  do_futex_wait (sem=sem@entry=0x1e011d0, abstime=0x0) at sem_waitcommon.c:111
      	#2  0x00007f6aefc578d4 in __new_sem_wait_slow (sem=0x1e011d0, abstime=0x0) at sem_waitcommon.c:181
      	#3  0x00007f6aefc5797a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
      	#4  0x00000000004ffbc4 in PyThread_acquire_lock ()
      	#5  0x00000000004dbe8a in PyEval_RestoreThread ()
      	#6  0x00007f6ac6d3b8fc in py_gil_retake_if_waslocked (arg=0x4f18f00) at bigfile/_bigfile.c:1048
      	#7  0x00007f6ac6d3dcfc in virt_gil_retake_if_waslocked (gilstate=0x4f18f00) at bigfile/virtmem.c:78
      	#8  0x00007f6ac6d3dd30 in virt_lock () at bigfile/virtmem.c:92
      	#9  0x00007f6ac6d3e724 in vma_unmap (vma=0x7f6a7e0c4100) at bigfile/virtmem.c:271
      	#10 0x00007f6ac6d3a0bc in pyvma_dealloc (pyvma0=0x7f6a7e0c40e0) at bigfile/_bigfile.c:284
      	...
      	#13 0x00000000004d76b0 in PyEval_EvalFrameEx ()
      
      T2:
      
      	#5  0x00007f6ac6d3a081 in pyvma_dealloc (pyvma0=0x7f6a7e0c40e0) at bigfile/_bigfile.c:276
      	#6  0x0000000000500450 in ?? ()
      	#7  0x00000000004ffd82 in _PyObject_GC_New ()
      	#8  0x0000000000485392 in PyList_New ()
      	#9  0x00000000004d3bff in PyEval_EvalFrameEx ()
      
      T2 does the work of vma_unmap and clears C-level vma. Then, when T1 wakes up and
      returns to vma_unmap, it sees vma->file and all other fields cleared -> oops
      segfault.
      
      Fix it by removing pyvma from GC list before going to do actual destruction.
      This way if a concurrent GC triggers, it won't see the vma object on its list,
      and thus won't have a chance to invoke its destructor the second time.
      
      The bug was introduced in 450ad804 (bigarray: ArrayRef support for BigArray)
      when PyVMA was changed to be cyclic-GC aware. However at that time, even Python
      documentation itself was not saying PyObject_GC_UnTrack is needed, as it was
      added only in 2.7.15 after finding that many types in CPython itself are
      vulnerable to similar segfaults:
      
      https://github.com/python/cpython/commit/4cde4bdcc86
      https://bugs.python.org/issue31095
      
      It is pity, that CPython took the approach to force all type authors to
      care to invoke PyObject_GC_UnTrack explicitly, instead of doing that
      automatically in Python runtime before calling tp_dealloc.
      
      /cc @Tyagov, @klaus
      /reviewed-on !11
      d97641d2
  17. 17 Apr, 2018 1 commit
    • Kirill Smelkov's avatar
      tests: Explicitly close ZODB connections for places with warnings found by previous patch · 01b995a4
      Kirill Smelkov authored
      bigfile/tests/test_filezodb.py ........W: testdb: teardown: <Connection at 7f8fe2b43b90> left not closed by test code; opened by:
        ...
        File "/home/kirr/src/wendelin/wendelin.core/bigfile/tests/test_filezodb.py", line 754, in test_bigfile_zblk1_zdata_reuse
          _test_bigfile_zblk1_zdata_reuse()
        File "/home/kirr/src/wendelin/wendelin.core/bigfile/tests/test_filezodb.py", line 759, in _test_bigfile_zblk1_zdata_reuse
          root = dbopen()
        File "/home/kirr/src/wendelin/wendelin.core/bigfile/tests/test_filezodb.py", line 47, in dbopen
          return testdb.dbopen()
        File "/home/kirr/src/wendelin/wendelin.core/lib/testing.py", line 188, in dbopen
          self.connv.append( (weakref.ref(conn), ''.join(traceback.format_stack())) )
      
      lib/tests/test_zodb.py .W: testdb: teardown: <Connection at 7f8fe26f13d0> left not closed by test code; opened by:
        ...
        File "/home/kirr/src/wendelin/wendelin.core/lib/tests/test_zodb.py", line 49, in test_deactivate_btree
          root = dbopen()
        File "/home/kirr/src/wendelin/wendelin.core/lib/tests/test_zodb.py", line 30, in dbopen
          return testdb.dbopen()
        File "/home/kirr/src/wendelin/wendelin.core/lib/testing.py", line 188, in dbopen
          self.connv.append( (weakref.ref(conn), ''.join(traceback.format_stack())) )
      01b995a4
  18. 02 Apr, 2018 1 commit
    • Kirill Smelkov's avatar
      bigarray: ArrayRef support for BigArray · 450ad804
      Kirill Smelkov authored
      Rationale
      ---------
      
      Array reference could be useful in situations where one needs to pass arrays
      between processes and instead of copying array data, leverage the fact that
      top-level array, for example ZBigArray, is already persisted separately, and
      only send small amount of information referencing data in question.
      
      Implementation
      --------------
      
      BigArray is not regular NumPy array and so needs explicit support in
      ArrayRef code to find root object and indices. This patch adds such
      support via the following way:
      
      - when BigArray.__getitem__ creates VMA, it remembers in the VMA
        the top-level BigArray object under which this VMA was created.
      
      - when ArrayRef is finding root, it can detect such VMAs, because it will
        be pointed to by the most top regular ndarray's .base, and in turn gets
        top-level BigArray object from the VMA.
      
      - further all indices computations are performed, similarly to complete regular
        ndarrays case, on ndarrays root and a. But in the end .lo and .hi are
        adjusted for the corresponding offset of where root is inside whole
        BigArray.
      
      - there is no need to adjust .deref() at all.
      
      For remembering information into a VMA and also to be able to get
      (readonly) its mapping addresses _bigfile.c extension has to be extended
      a bit. Since we are now storing arbitrary python object attached to
      PyVMA - it can create cycles - and so PyVMA accordingly adjusted to
      support cyclic garbage collector.
      
      Please see the patch itself for more details and comments.
      450ad804
  19. 31 Jan, 2018 1 commit
    • Kirill Smelkov's avatar
      bigfile/virtmem: Fix build with recent glibc · c3cc8a99
      Kirill Smelkov authored
      It was
      
      	bigfile/pagefault.c:45:36: warning: ‘struct ucontext’ declared inside parameter list will not be visible outside of this definition or declaration
      	 static int faulted_by(const struct ucontext *uc);
      	                                    ^~~~~~~~
      	bigfile/pagefault.c: In function ‘on_pagefault’:
      	bigfile/pagefault.c:59:24: warning: passing argument 1 of ‘faulted_by’ from incompatible pointer type [-Wincompatible-pointer-types]
      	     write = faulted_by(uc);
      	                        ^~
      	bigfile/pagefault.c:45:12: note: expected ‘const struct ucontext *’ but argument is of type ‘struct ucontext *’
      	 static int faulted_by(const struct ucontext *uc);
      	            ^~~~~~~~~~
      	bigfile/pagefault.c: At top level:
      	bigfile/pagefault.c:208:36: warning: ‘struct ucontext’ declared inside parameter list will not be visible outside of this definition or declaration
      	 static int faulted_by(const struct ucontext *uc)
      	                                    ^~~~~~~~
      	bigfile/pagefault.c:208:12: error: conflicting types for ‘faulted_by’
      	 static int faulted_by(const struct ucontext *uc)
      	            ^~~~~~~~~~
      	bigfile/pagefault.c:45:12: note: previous declaration of ‘faulted_by’ was here
      	 static int faulted_by(const struct ucontext *uc);
      	            ^~~~~~~~~~
      	bigfile/pagefault.c: In function ‘faulted_by’:
      	bigfile/pagefault.c:217:15: error: dereferencing pointer to incomplete type ‘const struct ucontext’
      	     write = uc->uc_mcontext.gregs[REG_ERR] & 0x2;
      	               ^~
      	bigfile/pagefault.c: At top level:
      	bigfile/pagefault.c:45:12: warning: ‘faulted_by’ used but never defined
      	 static int faulted_by(const struct ucontext *uc);
      	            ^~~~~~~~~~
      	bigfile/pagefault.c:208:12: warning: ‘faulted_by’ defined but not used [-Wunused-function]
      	 static int faulted_by(const struct ucontext *uc)
      	            ^~~~~~~~~~
      
      Change to using ucontext_t because apparently there is no `struct
      ucontext` anymore (and man for sigaction says 3rd parameter to hander is
      of type `ucontext_t *` - not `struct ucontext *` - cast to `void *`)
      
      Explicitly include <ucontext.h> because we are dereferencing ucontext_t,
      even though today it appears to be included by <signal.h>.
      c3cc8a99
  20. 12 Dec, 2017 1 commit
    • Kirill Smelkov's avatar
      virtmem: Benchmarks for pagefault handling · 3cfc2728
      Kirill Smelkov authored
      Benchmark the time it takes for virtmem to handle pagefault with noop
      loadblk for loadblk both implemented in C and in Python.
      
      On my computer it is:
      
      	name          µs/op
      	PagefaultC    269 ± 0%
      	pagefault_py  291 ± 0%
      
      Quite a big time in other words.
      
      It turned out to be mostly spent in fallocate'ing pages on tmpfs from
      /dev/shm. Part of the above 269 µs/op is taken by freeing (reclaiming)
      pages back when benchmarking work size exceed /dev/shm size, and part to
      allocating.
      
      If I limit the work size (via npage in benchmem.c) to be less than whole
      /dev/shm it starts to be ~ 170 µs/op and with additional tracing it
      shows as something like this:
      
          	.. on_pagefault_start   0.954 µs
          	.. vma_on_pagefault_pre 0.954 µs
          	.. ramh_alloc_page_pre  0.954 µs
          	.. ramh_alloc_page      169.992 µs
          	.. vma_on_pagefault     172.853 µs
          	.. vma_on_pagefault_pre 172.853 µs
          	.. vma_on_pagefault     174.046 µs
          	.. on_pagefault_end     174.046 µs
          	.. whole:               171.900 µs
      
      so almost all time is spent in ramh_alloc_page which is doing the fallocate:
      
      	https://lab.nexedi.com/nexedi/wendelin.core/blob/f11386a4/bigfile/ram_shmfs.c#L125
      
      Simple benchmark[1] confirmed it is indeed the case for fallocate(tmpfs) to be
      relatively slow[2] (and that for recent kernels it regressed somewhat
      compared to Linux 3.16). Profile flamegraph for that benchmark[3] shows
      internal loading of shmem_fallocate which for 1 hardware page is not
      that too slow (e.g. <1µs) but when a request comes for a region
      internally performs it page by page and so accumulates that ~ 170µs for 2M.
      
      I've tried to briefly rerun the benchmark with huge pages activated on /dev/shm via
      
      	mount /dev/shm -o huge=always,remount
      
      as both regular user and as root but it was executing several times
      slower. Probably something to investigate more later.
      
      [1] https://lab.nexedi.com/kirr/misc/blob/4f84a06e/tmpfs/t_fallocate.c
      [2] https://lab.nexedi.com/kirr/misc/blob/4f84a06e/tmpfs/1.txt
      [3] https://lab.nexedi.com/kirr/misc/raw/4f84a06e/tmpfs/fallocate-2M-nohuge.svg
      3cfc2728