Commit 9aa6a5d7 authored by Kirill Smelkov's avatar Kirill Smelkov

bigfile/py: Teach loadblk() to automatically break reference cycles to pybuf

Because otherwise we bug on pybuf->ob_refcnt != 1.

Such cycles might happen if inside loadblk implementation an exception
is internally raised and then caught even in deeply internal function
which does not receive pybuf as argument or by some other way:

After

	_, _, exc_traceback = sys.exc_info()

there is a reference loop created:

	exc_traceback
	  |        ^
	  |        |
	  v     .f_localsplus
	 frame

and since exc_traceback object holds reference to deepest frame, which via f_back
will be holding reference to frames up to frame with pybuf argument, it
will result in additional reference to pybuf being held until the above
cycle is garbage collected.

So to solve the problem while leaving loadblk, if pybuf->ob_refcnt !=
let's first do garbage-collection, and only then recheck left
references. After GC reference-loops created by exceptions should go
away.

NOTE PyGC_Collect() (C way to call gc.collect()) always performs
    GC - it is not affected by gc.disable() which disables only
    _automatic_ garbage collection.

NOTE it turned out out storeblk logic to unpin pybuf (see
    6da5172e "bigfile/py: Teach storeblk() how to correctly propagate
    traceback on error") is flawed, because when e.g. creating memoryview
    from pybuf internal pointer is copied and then clearing original buf
    does not result in clearing the copy.

NOTE it is ok to do gc.collect() from under sighandler - at least we are
    already doing it for a long time via running non-trivial python code
    which for sure triggers automatic GC from time to time (see also
    786d418d "bigfile: Simple test that we can handle GC from-under
    sighandler" for the reference)

Fixes: nexedi/wendelin.core#7
parent b4c269eb
...@@ -527,8 +527,39 @@ out: ...@@ -527,8 +527,39 @@ out:
XPyErr_FullClear(); XPyErr_FullClear();
/* verify pybuf is not held - its memory will go away right after return */ /* verify pybuf is not held - its memory will go away right after return */
if (pybuf) if (pybuf) {
/* pybuf might be held e.g. due to an exception raised & caught
* somewhere in loadblk implementation - so loadblk returns ok, but if
*
* _, _, exc_traceback = sys.exc_info()
*
* was also used inside the following reference loop is created:
*
* exc_traceback
* | ^
* | |
* v .f_localsplus
* frame
*
* and some of the frames continue to hold pybuf reference.
*
* Do full GC to collect such cycles this way removing references to
* pybuf.
*/
if (pybuf->ob_refcnt != 1) {
PyGC_Collect();
/* garbage collection could result in running arbitraty code
* because of finalizers. Print problems (if any) and make sure
* once again exception state is clear */
if (PyErr_Occurred())
PyErr_PrintEx(0);
XPyErr_FullClear();
}
/* now it is real bug if pybuf remains referenced from somewhere */
BUG_ON(pybuf->ob_refcnt != 1); BUG_ON(pybuf->ob_refcnt != 1);
}
/* drop pybuf /* drop pybuf
* *
...@@ -614,6 +645,14 @@ static int pybigfile_storeblk(BigFile *file, blk_t blk, const void *buf) ...@@ -614,6 +645,14 @@ static int pybigfile_storeblk(BigFile *file, blk_t blk, const void *buf)
/* we need to know only whether storeret != NULL, decref it now */ /* we need to know only whether storeret != NULL, decref it now */
Py_XDECREF(storeret); Py_XDECREF(storeret);
/* FIXME the following is not strictly correct e.g. for:
* mbuf = memoryview(buf)
* because mbuf.ptr will be a copy of buf.ptr and clearing buf does not
* clear mbuf.
*
* -> TODO rework to call gc.collect() if pybuf->ob_refcnt != 1
* like loadblk codepath does. */
/* repoint pybuf to empty region - the original memory attached to it can /* repoint pybuf to empty region - the original memory attached to it can
* go away right after we return (if e.g. dirty page was not mapped in any * go away right after we return (if e.g. dirty page was not mapped in any
* vma), but we need pybuf to stay not corrupt - for printing full * vma), but we need pybuf to stay not corrupt - for printing full
......
...@@ -151,14 +151,9 @@ def test_pagefault_savestate(): ...@@ -151,14 +151,9 @@ def test_pagefault_savestate():
# v .f_localsplus # v .f_localsplus
# frame # frame
# #
# Since upon returning we can't hold a reference to buf, let's # which result in holding additional ref to buf, but loadblk caller
# break the loop explicitly. # will detect and handle this situation via garbage-collecting
# # above cycle.
# Otherwise both exc_traceback and frame will be alive until next
# gc.collect() which cannot be perform in pagefault handler.
#
# Not breaking this loop will BUG with `buf.refcnt != 1` on return
del exc_traceback
self.loadblk_run = 1 self.loadblk_run = 1
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment