- 17 Aug, 2015 1 commit
-
-
Kirill Smelkov authored
FileH is a handle representing a snapshot of a file. If fileh already has a page loaded for some pgoffset, but we know the content of the file changed externally after the load, we need to tell fileh that the page must be invalidated (and reloaded on next access). This patch introduces fileh_invalidate_page(fileh, pgoffset) to do just that. In the next patch we'll use this facility to propagate invalidations of ZBlk ZODB objects to the virtmem subsystem.

NOTE Since invalidation removes "dirtiness" from a page's state, several subsequent invalidations can make a fileh completely non-dirty (by invalidating all dirty pages). Previously fileh->dirty was just one bit, so we needed to improve how we track dirtiness. One way would be to keep a list of dirty pages per fileh and operate on that. This has the advantage of also optimizing dirty-page processing such as fileh_dirty_writeout(), where we currently scan through all fileh pages just to write out only the PAGE_DIRTY ones. Another, simpler way is to make fileh->dirty a counter and maintain that. Since we are going to move the virtmem subsystem back into the kernel, the simpler, less intrusive approach is used here.
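A minimal Python sketch of the counter approach described in the NOTE, to show why a one-bit flag is not enough once pages can be invalidated. This is a toy model, not the C code from the patch: the class `FileHModel`, its method names and the `PAGE_LOADED` state are assumptions made for illustration; only `PAGE_DIRTY` and the idea of a `dirty` counter come from the message above.

```python
PAGE_LOADED = "loaded"   # assumed state name, for illustration only
PAGE_DIRTY = "dirty"     # mirrors PAGE_DIRTY mentioned in the commit message


class FileHModel:
    """Toy model of a file handle that tracks dirtiness with a counter."""

    def __init__(self):
        self.pagemap = {}   # pgoffset -> page state
        self.dirty = 0      # number of dirty pages, not a single bit

    def load_page(self, pgoffset):
        # Loading never changes the state of an already present page.
        self.pagemap.setdefault(pgoffset, PAGE_LOADED)

    def write_page(self, pgoffset):
        # Count a page only on its first transition to the dirty state.
        if self.pagemap.get(pgoffset) != PAGE_DIRTY:
            self.pagemap[pgoffset] = PAGE_DIRTY
            self.dirty += 1

    def invalidate_page(self, pgoffset):
        # Forget the page; if it was dirty, it no longer counts as such.
        state = self.pagemap.pop(pgoffset, None)
        if state == PAGE_DIRTY:
            self.dirty -= 1


fileh = FileHModel()
fileh.write_page(0)
fileh.write_page(1)
fileh.invalidate_page(0)
dirty_after_first = fileh.dirty    # one dirty page is still left
fileh.invalidate_page(1)
dirty_after_second = fileh.dirty   # handle is completely non-dirty again
```

With a single bit, the handle could not tell after the first invalidation whether any dirty page remained without rescanning all pages.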
-
- 09 Aug, 2015 1 commit
-
-
Kirill Smelkov authored
Previously we were limited to printing a traceback starting from just storeblk() down, via an explicit PyErr_PrintEx(). The reason was that pybuf was attached to memory which could go away right after return from the C function, so we had to destroy that object for sure and could not let any traceback hold a reference to it. This turned out to be too limiting and did not show the full context in which errors happen. So we do the following trick: before returning, reattach pybuf to an empty region at NULL. This way we don't need to worry about pybuf pointing to memory which can go away, and instead of printing the exception locally we just return it the usual way it is done with the C API in Python.

NOTE In contrast to PyMemoryViewObject, the PyBufferObject definition is not public, so to support Python2 we had to copy its definition into the PY2 compat header.

NOTE2 loadblk() is not touched: loading is done from sighandler context, which simulates working in a separate Python thread, so it is left as is for now.
-
- 06 Aug, 2015 4 commits
-
-
Kirill Smelkov authored
At present, several threads running concurrently can corrupt internal virtmem data structures (e.g. ram->lru_list, fileh->pagemap, etc). This can happen even if we have zope instances with only 1 worker thread, because there are other "system" threads, and Python garbage collection can trigger in any thread. So if a virtmem object, e.g. a VMA or FileH, was sitting in the GC queue to be collected, its collection, and thus e.g. vma_unmap() and fileh_close(), will be called from a thread other than the worker. Because of that, virtmem just has to be aware of threads so as not to allow internal data structure corruption.

On the other hand, the idea of introducing a userspace virtual memory manager turned out to be not so good from a performance and complexity point of view, and thus the plan is to try to move it back into the kernel. So it does not make sense to do a well-optimised locking implementation for the userspace version, and we just use a simple single "protect-all" big lock for virtmem.

Of particular note is the interaction with Python's GIL: any long-lived lock has to be taken with the GIL released, because otherwise it can deadlock (G is the GIL, V is the virtmem lock):

    t1      t2

    G
    V       G
    !G
            V
    G

So we introduce helpers to make sure the GIL is not taken, and to retake it if we were holding it initially. Those helpers (py_gil_ensure_unlocked / py_gil_retake_if_waslocked) are symmetrical opposites of what Python provides to make sure the GIL is locked (PyGILState_Ensure / PyGILState_Release).

Otherwise, the patch is a more or less straightforward application of the one-big-lock-to-protect-everything idea.
-
Kirill Smelkov authored
Mutex lock/unlock should not fail if mutex was correctly initialized/used.
-
Kirill Smelkov authored
We factored out SIGSEGV block/restore from fileh_dirty_writeout() to all functions in cb7a7055 (bigfile/virtmem: Block/restore SIGSEGV in non-pagefault-handling function). The restoration, however, just sets the whole thread sigmask. It is possible that between the block and restore calls the procmask for other signals was changed, and by setting the procmask directly we would overwrite those changes. So be careful, and when restoring the SIGSEGV mask, touch the mask bit for only that signal. (We need an xsigismember helper to get this done, which is also introduced in this patch.)
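The same idea can be sketched with Python's `signal.pthread_sigmask` (Unix only). This is an illustration of restoring a single mask bit, not the C code from the patch; the helper names `block_sigsegv` and `restore_sigsegv` are made up for the example, and SIGUSR1 merely stands in for "some other signal whose mask changed in between".

```python
import signal


def block_sigsegv():
    """Block SIGSEGV; return True if it was already blocked before the call."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGSEGV})
    return signal.SIGSEGV in old_mask


def restore_sigsegv(was_blocked):
    """Restore only the SIGSEGV bit, leaving all other mask bits untouched."""
    if not was_blocked:
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGSEGV})


was_blocked = block_sigsegv()

# Meanwhile some other code changes the mask for a different signal.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})

restore_sigsegv(was_blocked)

# Blocking an empty set changes nothing and returns the current mask.
current_mask = signal.pthread_sigmask(signal.SIG_BLOCK, set())
```

Had the restore step written back the whole saved mask with `SIG_SETMASK`, the SIGUSR1 change made in between would have been lost.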
-
Kirill Smelkov authored
We'll need this for functions which return an error code directly rather than in errno, e.g. pthread_sigmask().
-
- 03 Apr, 2015 10 commits
-
-
Kirill Smelkov authored
Exposes BigFile, so that users can define a BigFile backend in Python. Also exposed are BigFile handles and VMA objects, which are the results of mmapping.
-
Kirill Smelkov authored
Does similar things to what the kernel does: users can mmap file parts into the address space and access them read/write. The manager is invoked by the hardware/OS kernel when there is no page loaded for a read, or when a previously read-only page is being written to. In addition to the features provided in the kernel, it supports storing changes back in a transactional way (see fileh_dirty_writeout()) and can potentially use huge pages for mappings (though this is currently TODO).
-
Kirill Smelkov authored
Users can inherit from BigFile and provide custom ->loadblk() and ->storeblk() to load/store file blocks from a database or some other storage. The system can then memory map such files into user address space (see next patch).
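To illustrate the load/store contract, here is a stand-alone Python sketch of a backend that keeps blocks in a dict. It deliberately does not inherit from the real BigFile: the class name `DictBackedFile`, the `(blk, buf)` method signatures and the `blksize` parameter are assumptions for this example and may differ from the actual interface.

```python
class DictBackedFile:
    """Stand-in for a BigFile-like backend; blocks live in a dict."""

    def __init__(self, blksize):
        self.blksize = blksize
        self.blocks = {}    # block number -> bytes of length blksize

    def loadblk(self, blk, buf):
        # Fill the caller's buffer; blocks never stored read back as zeros.
        buf[:] = self.blocks.get(blk, bytes(self.blksize))

    def storeblk(self, blk, buf):
        # Keep an immutable copy so later changes to buf do not leak in.
        self.blocks[blk] = bytes(buf)


f = DictBackedFile(blksize=8)

page = bytearray(8)
f.loadblk(0, page)
initial = bytes(page)       # block 0 was never stored

page[:4] = b"data"
f.storeblk(0, page)

reloaded = bytearray(8)
f.loadblk(0, reloaded)      # gets back what was stored
```

A database-backed variant would replace the dict accesses with reads and writes against the storage of choice.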
-
Kirill Smelkov authored
This allows getting aliasable RAM from the OS kernel and managing it. Currently we get memory from a tmpfs mount; hugetlbfs should also work, but is TODO because hugetlbfs in the kernel needs to be improved. We need aliasing because we'll need to be able to memory map the same page into several places in the address space, e.g. for taking two overlapping slices of the same array at different times. Comes with test programs that show aliasing does not work for anonymous memory.
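What aliasing means here can be seen in a small Python sketch: the same file-backed page is mapped twice, and a write through one mapping is visible through the other. The sketch uses an ordinary temporary file rather than a dedicated tmpfs mount, which is an assumption made to keep it self-contained; it is not one of the test programs mentioned above.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE

with tempfile.TemporaryFile() as backing:
    backing.truncate(PAGE)                       # one page of backing storage

    # Two independent shared mappings of the same page of the same file.
    view_a = mmap.mmap(backing.fileno(), PAGE)
    view_b = mmap.mmap(backing.fileno(), PAGE)

    view_a[0:5] = b"hello"                       # write through the first view
    seen_via_b = bytes(view_b[0:5])              # read through the second view

    view_a.close()
    view_b.close()
```

Two separate anonymous mappings have no common backing object, so they cannot be made to share a page this way; that is why the memory is taken from a file on tmpfs.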
-
Kirill Smelkov authored
This will be the core of the virtual memory subsystem. For now we just define a structure to describe pages of memory and add a utility to allocate address space from the OS.
-
Kirill Smelkov authored
-
Kirill Smelkov authored
For BigFiles we'll need to maintain an `{} offset-in-file -> void *` mapping. A hash or a binary tree could be used there, but since we know files are most of the time accessed sequentially and locally in batches of pages, we can also organize the mapping in batches of keys. Specifically, the offset bits are divided into parts such that every part addresses 1 entry in a table that is one hardware page in size. To get to the actual value, the system looks up the first table by the first part of the offset, then, from the first table and the next part of the offset, the second table, etc. To clients this looks like a dictionary with get/set/del & clear methods, but lookups are always O(1) time, and in contrast to hashes, values are stored with locality (= adjacent lookups almost always access the same tables).
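The multi-level lookup can be modelled in a few lines of Python. This is only a sketch of the scheme described above, not the C implementation: the class name `PageMapModel`, the choice of 9 bits per level (a 4096-byte table of 8-byte entries) and the fixed depth of 3 levels are assumptions, and empty tables are not freed on delete to keep the example short.

```python
BITS_PER_LEVEL = 9                   # assumed: 4096-byte table / 8-byte entries
LEVELS = 3                           # assumed depth: keys up to 27 bits
FANOUT = 1 << BITS_PER_LEVEL         # 512 entries per table
KEY_LIMIT = 1 << (BITS_PER_LEVEL * LEVELS)


class PageMapModel:
    """Toy multi-level table mapping an integer key to a value."""

    def __init__(self):
        self.root = [None] * FANOUT

    def _parts(self, key):
        # Split the key into per-level indices, most significant part first.
        if not 0 <= key < KEY_LIMIT:
            raise ValueError("key out of range")
        shifts = range(BITS_PER_LEVEL * (LEVELS - 1), -1, -BITS_PER_LEVEL)
        return [(key >> s) & (FANOUT - 1) for s in shifts]

    def _leaf(self, key, create):
        # Walk down to the last-level table; return (table, index) or None.
        *upper, last = self._parts(key)
        table = self.root
        for idx in upper:
            if table[idx] is None:
                if not create:
                    return None
                table[idx] = [None] * FANOUT
            table = table[idx]
        return table, last

    def get(self, key):
        found = self._leaf(key, create=False)
        return None if found is None else found[0][found[1]]

    def set(self, key, value):
        table, idx = self._leaf(key, create=True)
        table[idx] = value

    def delete(self, key):
        found = self._leaf(key, create=False)
        if found is not None:
            found[0][found[1]] = None

    def clear(self):
        self.root = [None] * FANOUT


pm = PageMapModel()
pm.set(5, "page-5")
pm.set(6, "page-6")
pm.set(1 << 20, "far-page")

# Adjacent keys end up in the very same last-level table (locality).
same_leaf = pm._leaf(5, create=False)[0] is pm._leaf(6, create=False)[0]
```

The number of table hops per lookup is fixed by `LEVELS` and does not depend on how many keys are stored, which is where the constant-time behaviour comes from.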
-
Kirill Smelkov authored
-
Kirill Smelkov authored
Like taking an exact integer log2, upcasting pointers for C-style inheritance done in a Plan9 way, and wrappers to functions which should never fail.
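As an illustration of the first utility, "exact" integer log2 can be read as: return n only when the argument is exactly 2**n, and treat anything else as an error. The Python sketch below follows that reading; the function name `ilog2_exact` and the choice of raising `ValueError` are assumptions for the example, not the C helper from the patch.

```python
def ilog2_exact(x):
    """Return n such that 2**n == x; reject values that are not powers of two."""
    # A power of two has exactly one bit set, so x & (x - 1) clears it to zero.
    if x <= 0 or x & (x - 1):
        raise ValueError("not a power of two: %r" % (x,))
    return x.bit_length() - 1


page_shift = ilog2_exact(4096)   # typical hardware page size
```

Such a helper is handy for turning a page size into a shift count while catching sizes that are not powers of two.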
-
Kirill Smelkov authored
Modelled on the ones used in the Linux kernel.
-