wendelin.core can kill a zope process
I was able to kill a zope process by doing the following at the same time:
- Append to ZBigArray
- Read from ZBigArray
The following traceback was shown on stderr:
python2.7: bigfile/bigfile.c:549: pybigfile_loadblk: Assertion `!(pybuf->obrefcnt != 1)' failed.
Thanks for reporting. Here is where it happens, for reference:
We have been seeing this bug happen from time to time, but could not get it reproduced. So if you have a simple reproducing program, please post it here.
So about #1:
new_one_x = zarray.shape[0] + ndarray.shape[0]   # length along axis 0 after append
new_one_y = zarray.shape[0]
self.log('New shape : %s, %s' % (new_one_x, new_one_y))
zarray.resize((new_one_x,) + zarray.shape[1:])   # grow the array first
zarray[-ndarray.shape[0]:] = ndarray             # then write the appended chunk
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)
zarray = self.getArray()
clf = IsolationForest(max_samples=10000, random_state=rng)
clf.fit(zarray[:1000])                     # the model must be fit before predict
y_pred_train = clf.predict(zarray[1000:])
I tried one more time and I can reproduce it.
Update: Here is what is probably happening:
- when the Python implementation of loadblk(..., buf) is invoked by bigfile.c, it does some work
- while doing this work, exceptions can be raised and caught
- for an exception raised and caught, with the traceback retrieved via
sys.exc_info(), a reference cycle is created between the traceback and the frame holding buf: https://lab.nexedi.com/nexedi/wendelin.core/blob/e73e22ea/bigfile/tests/test_basic.py#L146
- this happens even if
sys.exc_info() was called in a function below the one working with buf - e.g. in zzz() called by BadFile.loadblk() here: kirr/wendelin.core@10420e82
- such cycles stay alive until the next garbage collection run
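The cycle described above can be demonstrated in plain Python (Buf and loadblk below are hypothetical stand-ins): storing the sys.exc_info() tuple in a local creates a frame -> local -> traceback -> frame cycle, so everything the frame references stays alive until the cyclic collector runs:

```python
import gc
import sys
import weakref

class Buf(object):
    pass

ref = []

def loadblk():
    buf = Buf()                     # the frame now references buf
    ref.append(weakref.ref(buf))
    try:
        raise ValueError("boom")
    except ValueError:
        exc_info = sys.exc_info()   # traceback -> frame -> exc_info: a cycle

loadblk()
# the frame (and thus buf) is kept alive only by the cycle
assert ref[0]() is not None
gc.collect()                        # only the cyclic collector frees it
assert ref[0]() is None
```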
Possible solutions could be:
- "unpin" buf to point to NULL/empty memory, similar to what we already do for
- do not point pybuf at the destination memory initially; after loading into an intermediate buffer, do one more copy into the destination
All solutions have their drawbacks. I'm taking some time to contemplate how best to proceed.
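Until this is fixed on the bigfile.c side, a user-level mitigation, recommended by the sys.exc_info() documentation itself, is to break the cycle explicitly by deleting the saved tuple in a finally clause. A sketch (the simulated IOError stands in for real loading work; the signature is illustrative, not the exact wendelin.core one):

```python
import sys

def loadblk(blk, buf):
    """Sketch of a loadblk that saves exc_info without leaking its frame."""
    try:
        raise IOError("simulated read error")   # stand-in for real loading work
    except Exception:
        exc_info = sys.exc_info()
        try:
            return "loadblk(%d) failed: %s" % (blk, exc_info[1])
        finally:
            del exc_info   # break the traceback <-> frame cycle promptly
```

With the cycle broken, buf's frame is released by plain reference counting as soon as loadblk returns, instead of lingering until the next gc run.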