- 14 Jul, 2016 1 commit
-
-
Kirill Smelkov authored
The following started to appear after a recent gcc upgrade on my host:

    bigfile/virtmem.c: In function `vma_on_pagefault':
    bigfile/virtmem.c:696:9: warning: implicit declaration of function `usleep' [-Wimplicit-function-declaration]
             usleep(10000); // XXX with 1000 uslepp still busywaits
-
- 15 Dec, 2015 3 commits
-
-
Kirill Smelkov authored
loadblk() calls are potentially slow, and the external code that serves the call can take other locks in addition to the virtmem lock taken by the virtmem subsystem. If those "other locks" are also taken before external code calls e.g. fileh_invalidate_page() in a different codepath, a deadlock can happen, e.g.

      T1                  T2

      page-access         invalidation-from-server received

      V -> loadblk
                          Z <- ClientStorage.invalidateTransaction()
      Z -> zeo.load
                          V <- fileh_invalidate_page

The solution to avoid the deadlock is to call loadblk() with the virtmem lock released and, upon loadblk() completion, recheck virtmem data structures carefully.

To make that happen:

- a new page state is introduced:

      PAGE_LOADING (file content loading is in progress)

- virtmem releases virt_lock before calling loadblk() when serving a pagefault

- because loading is now done with the virtmem lock released, now:

1. After loading completes we need to recheck fileh/vma data structures.

   The recheck is done in full - vma_on_pagefault() just asks its driver (see VM_RETRY and VM_HANDLED codes) to retry handling the fault completely. This should work as the freshly loaded page was just inserted into fileh->pagemap and should be found there in the cache on the next lookup.

   On the other hand this also works correctly if there was a concurrent change - e.g. the vma was unmapped while we were loading the data. In that case the fault will also be processed correctly, but the loaded data will stay in fileh->pagemap (and, if not used, will eventually be evicted as not-needed by RAM reclaim).

2. A similar retrying mechanism is used for cases when two threads concurrently access the same page and would both try to load the corresponding block - only one thread issues the actual loadblk() and the other waits for the load to complete with polling and VM_RETRY.

3. To correctly invalidate loading-in-progress pages another new page state is introduced:

      PAGE_LOADING_INVALIDATED (file content loading was in progress while request to invalidate the page came in)

   which fileh_invalidate_page() uses to propagate the invalidation message to the loadblk() caller.

4. Block loading can now happen in parallel with other block loading and other virtmem operations - e.g. invalidation.

   For such cases tests are added to test_thread.py

5. The virtmem lock now becomes just a regular lock, instead of the previously recursive one.

   The virtmem lock needed to be recursive for cases when code under loadblk() could trigger other virtmem calls, e.g. due to GC calling another VMA dtor that would want to lock virtmem while the virtmem lock was already held.

   This is no longer needed.

6. To catch double faults we can no longer use just one static variable in_on_pagefault. That variable thus becomes thread-local.

7. The old test in test_thread to "test that access vs access don't overlap" no longer holds true - and is thus removed.

/cc @Tyagov, @klaus
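The scheme described above can be sketched as a small Python model. This is only an illustration under stated assumptions, not the C code from bigfile/virtmem.c: the class `FileH`, the `PageState` enum, the `access()` driver loop and the 1 ms polling interval are all hypothetical names and values chosen for the sketch; only the state names, VM_RETRY / VM_HANDLED and the "load with the lock released, then retry the fault" idea come from the commit message.

```python
import threading
import time
from enum import Enum, auto


class PageState(Enum):
    LOADING = auto()              # loadblk() is in progress
    LOADING_INVALIDATED = auto()  # invalidated while loadblk() was in progress
    LOADED = auto()


VM_HANDLED, VM_RETRY = "VM_HANDLED", "VM_RETRY"


class FileH:
    """Toy file handle (hypothetical): pagemap maps pgoffset -> [state, data]."""

    def __init__(self, loadblk):
        self.loadblk = loadblk
        self.pagemap = {}
        self.virt_lock = threading.Lock()  # regular, non-recursive lock

    def on_pagefault(self, pgoffset):
        with self.virt_lock:
            page = self.pagemap.get(pgoffset)
            if page is not None:
                if page[0] is PageState.LOADED:
                    return VM_HANDLED
                # another thread is loading this page: ask the driver to retry
                return VM_RETRY
            page = [PageState.LOADING, None]
            self.pagemap[pgoffset] = page

        # virt_lock is released here: loadblk() may be slow and take other locks
        data = self.loadblk(pgoffset)

        with self.virt_lock:
            if page[0] is PageState.LOADING_INVALIDATED:
                # invalidation arrived during loading: drop the stale data
                del self.pagemap[pgoffset]
            else:
                page[0], page[1] = PageState.LOADED, data
        # nothing computed before the load is trusted: redo the fault in full
        return VM_RETRY

    def invalidate_page(self, pgoffset):
        with self.virt_lock:
            page = self.pagemap.get(pgoffset)
            if page is None:
                return
            if page[0] is PageState.LOADED:
                del self.pagemap[pgoffset]
            else:
                page[0] = PageState.LOADING_INVALIDATED


def access(fileh, pgoffset):
    """Driver loop: retry the fault until it is handled, polling in between."""
    while fileh.on_pagefault(pgoffset) == VM_RETRY:
        time.sleep(0.001)
    return fileh.pagemap[pgoffset][1]
```

In this model a `loadblk` callback may itself call `invalidate_page()` without deadlocking, because the lock is not held during the load; the page then ends up dropped instead of cached.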
-
Kirill Smelkov authored
Previously virt_lock() automatically made sure to unlock the GIL before locking virtmem, and to restore the GIL to its previous state once the virtmem lock was taken. virt_unlock() unlocked just the virtmem lock without touching the GIL at all - that works because the running code eventually releases the GIL, as Python regularly does to allow multiple threads to run.

In the next patch, however, we'll need to wait for an in-progress page load to complete, and that wait has to be done with the GIL released (so other Python threads can run). For that we'll need functionality to make sure the GIL is unlocked and to retake it back, not tied to virt_lock().

So factor it out.
-
Kirill Smelkov authored
Both comments date from the beginning - from 9a293c2d (bigfile/virtmem: Userspace Virtual Memory Manager) - but the d53271b9 patch (bigfile/virtmem: Big Virtmem lock) failed to update them.
-
- 17 Aug, 2015 1 commit
-
-
Kirill Smelkov authored
FileH is a handle representing a snapshot of a file. If, for a pgoffset, fileh already has a loaded page, but we know the content of the file has changed externally after loading was done, we need to propagate to fileh that such-and-such page should be invalidated (and reloaded on next access).

This patch introduces

    fileh_invalidate_page(fileh, pgoffset)

to do just that.

In the next patch we'll use this facility to propagate invalidations of ZBlk ZODB objects to the virtmem subsystem.

NOTE

Since invalidation removes "dirtiness" from a page state, several subsequent invalidations can make a fileh completely non-dirty (invalidating all dirty pages). Previously fileh->dirty was just one bit, so we needed to improve how we track dirtiness.

One way would be to have a dirty list for fileh pages and operate on that. This has the advantage of even optimizing dirty page processing such as fileh_dirty_writeout(), where we currently scan through all fileh pages just to write only the PAGE_DIRTY ones.

Another, simpler way is to make fileh->dirty a counter and maintain that.

Since we are going to move the virtmem subsystem back into the kernel, the simpler, less intrusive approach is used here.
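The counter approach mentioned in the NOTE can be illustrated with a minimal Python sketch. It is an assumption-laden model, not the C implementation: `DirtyTrackingFileH`, `load_page()` and `write_page()` are hypothetical names; only the idea of `dirty` being a count of PAGE_DIRTY pages that invalidation decrements comes from the commit message.

```python
from enum import Enum, auto


class PageState(Enum):
    LOADED = auto()
    DIRTY = auto()


class DirtyTrackingFileH:
    """Toy file handle (hypothetical) that tracks dirtiness with a counter."""

    def __init__(self):
        self.pagemap = {}  # pgoffset -> PageState
        self.dirty = 0     # number of dirty pages, not a single flag

    def load_page(self, pgoffset):
        self.pagemap.setdefault(pgoffset, PageState.LOADED)

    def write_page(self, pgoffset):
        self.load_page(pgoffset)
        if self.pagemap[pgoffset] is not PageState.DIRTY:
            self.pagemap[pgoffset] = PageState.DIRTY
            self.dirty += 1

    def invalidate_page(self, pgoffset):
        # Invalidation forgets the page; if it was dirty, one less page is dirty.
        state = self.pagemap.pop(pgoffset, None)
        if state is PageState.DIRTY:
            self.dirty -= 1
```

With a single dirty bit, invalidating one of two dirty pages would leave no way to tell whether the handle is still dirty; with the counter, the handle becomes non-dirty exactly when the last dirty page is invalidated.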
-
- 06 Aug, 2015 3 commits
-
-
Kirill Smelkov authored
At present several running threads can corrupt internal virtmem data structures (e.g. ram->lru_list, fileh->pagemap, etc).

This can happen even if we have zope instances with only 1 worker thread - because there are other "system" threads, and Python garbage collection can trigger in any thread. So if a virtmem object, e.g. a VMA or FileH, was sitting in the GC queue to be collected, its collection, and thus e.g. vma_unmap() and fileh_close(), will be called from a thread other than the worker.

Because of that, virtmem just has to be aware of threads so as not to allow internal data structure corruption.

On the other hand, the idea of introducing a userspace virtual memory manager turned out to be not so good from a performance and complexity point of view, and thus the plan is to try to move it back into the kernel. So it does not make sense to do a well-optimised locking implementation for the userspace version.

So we do just a simple single "protect-all" big lock for virtmem.

Of particular note is the interaction with Python's GIL - any long-lived lock has to be taken with the GIL released, because otherwise it can deadlock (G is the GIL, V is the virtmem lock):

    t1  t2

    G
    V   G
   !G   V
    G

so we introduce helpers to make sure the GIL is not taken, and to retake it back if we were holding it initially.

Those helpers (py_gil_ensure_unlocked / py_gil_retake_if_waslocked) are symmetrical opposites to what Python provides to make sure the GIL is locked (via PyGILState_Ensure / PyGILState_Release).

Otherwise, the patch is a more-or-less straightforward application of the one-big-lock-to-protect-everything idea.
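The lock-ordering rule above can be modelled in Python with two ordinary locks. This is a sketch under loud assumptions: the real GIL cannot be released from pure Python code, so `gil` here is only a stand-in `threading.Lock`, and `virt_lock()`, `virt_unlock()` and `worker()` are hypothetical model functions, not the C helpers themselves.

```python
import threading

gil = threading.Lock()   # stand-in for Python's GIL ("G"), an assumption of this model
virt = threading.Lock()  # stand-in for the virtmem big lock ("V")


def virt_lock():
    """Take V. Must be called with G held; G is dropped while waiting for V."""
    gil.release()   # plays the role of py_gil_ensure_unlocked
    virt.acquire()  # may block for long, but we keep nobody waiting on G
    gil.acquire()   # plays the role of py_gil_retake_if_waslocked


def virt_unlock():
    virt.release()  # G is left untouched


def worker(counter, n):
    for _ in range(n):
        with gil:
            virt_lock()
            counter[0] += 1  # data protected by V
            virt_unlock()
```

Because a thread never blocks on V while holding G, the cycle from the diagram (one thread holding V and waiting for G, the other holding G and waiting for V) cannot form in this model.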
-
Kirill Smelkov authored
We factored out SIGSEGV block/restore from fileh_dirty_writeout() to all functions in cb7a7055 (bigfile/virtmem: Block/restore SIGSEGV in non-pagefault-handling function). The restoration, however, just sets the whole thread sigmask.

It is possible that between the block and restore calls the mask for other signals was changed, and this way - by setting the whole mask directly - we would overwrite those changes.

So be careful, and when restoring the SIGSEGV mask, touch the mask bit for only that signal.

( we need an xsigismember helper to get this done, which is also introduced in this patch )
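The difference between restoring the whole mask and restoring a single bit can be shown with Python's `signal.pthread_sigmask` (Unix only). This is a hedged sketch, not the C code: `sigsegv_block()` and `sigsegv_restore()` are hypothetical helper names, and the `in` membership test stands in for what the commit calls xsigismember.

```python
import signal


def sigsegv_block():
    """Block SIGSEGV in the calling thread; return whether it was already blocked."""
    old = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGSEGV})
    return signal.SIGSEGV in old


def sigsegv_restore(was_blocked):
    """Put back only the SIGSEGV bit, leaving every other signal's bit untouched."""
    if not was_blocked:
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGSEGV})
```

Had the restore step used `SIG_SETMASK` with the saved old mask instead, any signal blocked or unblocked between the two calls would silently revert to its earlier state.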
-
Kirill Smelkov authored
Code that does not run in pagefault context should not access any not-mmapped memory. Here we just refactor the code we already had to block/restore SIGSEGV in fileh_dirty_writeout() and use it in all functions called from non-pagefaulting context, as promised.

This way, if there is an error in the virtmem implementation which incorrectly accesses memory prepared for BigFile mappings, we'll just die with a coredump instead of trying to incorrectly handle the pagefault.
-
- 03 Apr, 2015 2 commits
-
-
Kirill Smelkov authored
Does similar things to what the kernel does - users can mmap file parts into the address space and access them read/write. The manager gets invoked by the hardware/OS kernel when there is no page loaded for a read, or when a previously read-only page is being written to.

In addition to the features provided by the kernel, it supports storing changes back in a transactional way (see fileh_dirty_writeout()) and can potentially use huge pages for mappings (though this is currently TODO)
-
Kirill Smelkov authored
This will be the core of the virtual memory subsystem. For now we just define a structure to describe pages of memory and add a utility to allocate address space from the OS.
-