- 15 Apr, 2020 2 commits
-
-
Kirill Smelkov authored
Wendelin.core 2 will need to hook into the moment when a client ZODB.Connection changes its database view, and to readjust the WCFS-level client connection accordingly. A ZODB.Connection can change its view either on connection reopen or, even without a reopen, at the start of a new transaction. This patch implements ZODB.Connection.onResyncCallback for ZODB5 only. ZODB4 and ZODB3 support is TODO.
-
Kirill Smelkov authored
For wendelin.core v2 we need a way to know which particular database state an application-level ZODB connection is viewing. Knowing that state, the WCFS client library will interact with the WCFS filesystem server and, in simple terms, request the server to provide data as of that particular database state.

Contrary to ZODB/go[1], ZODB/py does not provide the functionality to obtain the DB state of a connection's view, so we have to build it ourselves. Let us call the function that, for a client ZODB connection, returns the database state corresponding to its database view zconn_at.

It is relatively easy to implement zconn_at for ZODB5, since ZODB5 adopted MVCC uniformly, and this patch does just that. However, all currently released ZODB5 versions have a race in Connection.open() vs invalidations[2], so the first ZODB5 release with which the zconn_at implemented here will work reliably should be the upcoming ZODB 5.5.2.

It is TODO to implement zconn_at for ZODB4 and ZODB3, which organize things differently.

Please note what would happen if zconn_at gave an answer that is even slightly incorrect: the wcfs client would ask the wcfs server to provide array data as of a different database state than the one the on-client ZODB connection is currently viewing. As a result, data accessed via ZBigArray would _not_ correspond to all other data accessed via the regular ZODB mechanism. In other words, this would be data corruption.

[1] https://godoc.org/lab.nexedi.com/kirr/neo/go/zodb#Connection
[2] https://github.com/zopefoundation/ZODB/issues/290
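As a rough illustration of what "database state of the view" means under MVCC, the sketch below assumes that a connection's view is bounded by a `before` transaction ID (the connection sees only data committed by transactions with tid < before) and that transaction IDs are 8-byte big-endian strings. Under those assumptions the state the connection is "at" is simply `before - 1`. The helper name `before2at` is hypothetical here, and how the patch actually obtains `before` from a ZODB5 connection is not shown.

```python
import struct


def before2at(before):
    """Convert a `before` tid into the corresponding `at` tid.

    Both values are 8-byte big-endian strings (assumption of this sketch).
    A view "before X" is the same as a view "at X-1".
    """
    (n,) = struct.unpack(">Q", before)
    if n == 0:
        raise ValueError("before=0 has no corresponding at")
    return struct.pack(">Q", n - 1)


if __name__ == "__main__":
    before = struct.pack(">Q", 0x0285CBAC258BF267)  # made-up tid, for illustration only
    print(before2at(before).hex())
```

An off-by-one here is exactly the kind of "even slightly incorrect" answer the message warns about, which is why the conversion is kept explicit.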
-
- 01 Apr, 2020 1 commit
-
-
Kirill Smelkov authored
-
- 18 Dec, 2019 1 commit
-
-
Kirill Smelkov authored
It had long been marked as "XXX move to common place".
-
- 12 Jul, 2019 1 commit
-
-
Kirill Smelkov authored
For tests, this makes sure that if one test fails, it won't make the following tests fail just because the next test cannot lock the test database. For regular code (demo_zbigarray.py) this is also a good thing to do: always close the database regardless of whether an exception was raised before the program reached the end of main.

Pygolang becomes a regular, not test-only, dependency. Being a regular dependency is currently required only by demo_zbigarray.py, but it will also be used in the upcoming wcfs, so adding pygolang to the wendelin.core dependencies aligns with the plan.

dbclose now uses defer almost everywhere. There are still a few places in tests where one test function opens and closes the test database multiple times; those were not (yet?) converted.
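The idea of scheduling the close right after the open can be sketched with the standard library alone. The commit itself uses pygolang's defer; this sketch swaps in `contextlib.ExitStack.callback` as a stand-in so that it has no third-party dependency. `FakeDB`, `dbopen` and `dbclose` below are hypothetical placeholders, not the real wendelin.core helpers.

```python
from contextlib import ExitStack


class FakeDB:
    """Hypothetical stand-in for a database handle."""
    def __init__(self):
        self.closed = False


def dbopen():
    return FakeDB()


def dbclose(db):
    db.closed = True


def main(fail=False):
    """Open the database, do some work, and always close the database."""
    opened = []
    try:
        with ExitStack() as defer:
            db = dbopen()
            opened.append(db)
            defer.callback(dbclose, db)     # runs on both normal and exceptional exit
            if fail:
                raise RuntimeError("simulated failure in the middle of main")
    except RuntimeError:
        pass
    return opened[0]


if __name__ == "__main__":
    print(main(fail=False).closed, main(fail=True).closed)
```

Registering the cleanup immediately after acquiring the resource keeps the two next to each other, so a later failure cannot leave the database locked for whoever runs next.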
-
- 29 Oct, 2018 1 commit
-
-
Kirill Smelkov authored
Structured creates a view of the array, interpreting its minor axis as fully covered by a dtype.

It is similar to arr.view(dtype) + the corresponding reshape, but does not have the limitations of ndarray.view(). For example:

    In [1]: a = np.arange(3*3, dtype=np.int32).reshape((3,3))

    In [2]: a
    Out[2]:
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]], dtype=int32)

    In [3]: b = a[:2,:2]

    In [4]: b
    Out[4]:
    array([[0, 1],
           [3, 4]], dtype=int32)

    In [5]: dtxy = np.dtype([('x', np.int32), ('y', np.int32)])

    In [6]: dtxy
    Out[6]: dtype([('x', '<i4'), ('y', '<i4')])

    In [7]: b.view(dtxy)
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-66-af98529aa150> in <module>()
    ----> 1 b.view(dtxy)

    ValueError: To change to a dtype of a different size, the array must be C-contiguous

    In [8]: structured(b, dtxy)
    Out[8]:
    array([(0, 1), (3, 4)],
          dtype=[('x', '<i4'), ('y', '<i4')])

Structured always creates a view and never copies data.

Here is the original context, where separately playing with .shape and .dtype was not enough, since it was creating an array copy and OOM'ing the machine:

    klaus/wendelin@cbe4938b
-
- 17 Apr, 2018 1 commit
-
-
Kirill Smelkov authored
bigfile/tests/test_filezodb.py ........W: testdb: teardown: <Connection at 7f8fe2b43b90> left not closed by test code; opened by:
...
  File "/home/kirr/src/wendelin/wendelin.core/bigfile/tests/test_filezodb.py", line 754, in test_bigfile_zblk1_zdata_reuse
    _test_bigfile_zblk1_zdata_reuse()
  File "/home/kirr/src/wendelin/wendelin.core/bigfile/tests/test_filezodb.py", line 759, in _test_bigfile_zblk1_zdata_reuse
    root = dbopen()
  File "/home/kirr/src/wendelin/wendelin.core/bigfile/tests/test_filezodb.py", line 47, in dbopen
    return testdb.dbopen()
  File "/home/kirr/src/wendelin/wendelin.core/lib/testing.py", line 188, in dbopen
    self.connv.append( (weakref.ref(conn), ''.join(traceback.format_stack())) )

lib/tests/test_zodb.py .W: testdb: teardown: <Connection at 7f8fe26f13d0> left not closed by test code; opened by:
...
  File "/home/kirr/src/wendelin/wendelin.core/lib/tests/test_zodb.py", line 49, in test_deactivate_btree
    root = dbopen()
  File "/home/kirr/src/wendelin/wendelin.core/lib/tests/test_zodb.py", line 30, in dbopen
    return testdb.dbopen()
  File "/home/kirr/src/wendelin/wendelin.core/lib/testing.py", line 188, in dbopen
    self.connv.append( (weakref.ref(conn), ''.join(traceback.format_stack())) )
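The warnings above come from a test database that remembers, for every opened connection, a weak reference plus the stack at open time (the `self.connv.append(...)` line in the traceback). A minimal sketch of that bookkeeping follows. Only the `connv` append is taken from the log; `FakeConn`, `ConnTracker` and the `teardown` logic are assumptions made for illustration and may differ from the real lib/testing.py.

```python
import traceback
import weakref


class FakeConn:
    """Hypothetical stand-in for a ZODB connection."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class ConnTracker:
    """Remember who opened each connection; report the ones left open."""
    def __init__(self):
        self.connv = []     # list of (weakref to connection, stack at open time)

    def dbopen(self):
        conn = FakeConn()
        self.connv.append((weakref.ref(conn), ''.join(traceback.format_stack())))
        return conn

    def teardown(self):
        warnings = []
        for wconn, opened_by in self.connv:
            conn = wconn()
            if conn is None or conn.closed:
                continue    # already garbage-collected or properly closed
            warnings.append("W: testdb: teardown: %r left not closed by test code; "
                            "opened by:\n%s" % (conn, opened_by))
        self.connv = []
        return warnings
```

Holding only a weak reference means the tracker itself never keeps a forgotten connection alive, while the saved stack points directly at the test that opened it.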
-
- 24 Oct, 2017 1 commit
-
-
Kirill Smelkov authored
Relicense to GPLv3+ with wide exception for all Free Software / Open Source projects + Business options.

Nexedi stack is licensed under Free Software licenses with various exceptions that cover three business cases:

- Free Software
- Proprietary Software
- Rebranding

As long as one intends to develop Free Software based on Nexedi stack, no license cost is involved. Developing proprietary software based on Nexedi stack may require a proprietary exception license. Rebranding Nexedi stack is prohibited unless rebranding license is acquired.

Through this licensing approach, Nexedi expects to encourage Free Software development without restrictions and at the same time create a framework for proprietary software to contribute to the long term sustainability of the Nexedi stack.

Please see https://www.nexedi.com/licensing for details, rationale and options.
-
- 14 Aug, 2016 1 commit
-
-
Kirill Smelkov authored
13c0c17c (bigfile/zodb: Format #1 which is optimized for small changes) used BTree to organize ZBlk1 block's chunks and, for loadblkdata(), added "TODO we are missing to free internal BTree structures on data load".

#3, besides other things, showed that even when we deactivate ZData objects, we still keep them as ghosts occupying memory, and the same holds for IOBucket objects.

This all happens because there is no proper way to deactivate a whole btree, including its internal bucket objects. Since internal buckets are not deactivated, they stay in the picklecache and thus hold references to ZData objects, and ZData objects in turn, even if explicitly deactivated, stay in memory.

We can fix all of this by implementing a whole-btree deactivation procedure. To do so we need to iterate over all btree buckets recursively, but unfortunately there is no BTree API to access/iterate a btree's buckets.

We can, however, still get a reference to the first top-level buckets via gc.get_referents(btree) and then scan buckets further without hacks. gc.get_referents(btree) is a hack, but

- it works in O(1) (we only get pointers from the btree, not scanning all gcable objects and deducing them)
- it works reliably if we filter out non-interesting objects.

So in the end it works.

Before the patch, loading more and more ZBlk1 data with objgraph instrumentation looked like

    #                                   Nobj    δ
    wendelin.bigfile.file_zodb.ZData    7168    +512
    BTrees.IOBTree.IOBucket              238    +17
    BTrees.IOBTree.IOBTree                14    +1

and after this patch we now have

    BTrees.IOBTree.IOBTree                14    +1

We cannot remove that "IOBTree +1", since ZBlk1 holds a direct reference to it (via .chunktab) and we have to keep ZBlk1 live with ._v_zfile and ._v_zblk set for invalidation to work.

"+1 IOBTree" is however small, 144 bytes per 2M (= 0.006%), so we can neglect it the same way we neglect keeping ZBlk1 live for each block.
-
- 02 Jun, 2015 1 commit
-
-
Kirill Smelkov authored
e.g.

    In [1]: multiply.reduce((1<<30, 1<<30, 1<<30))
    Out[1]: 0

instead of

    In [2]: (1<<30) * (1<<30) * (1<<30)
    Out[2]: 1237940039285380274899124224

    In [3]: 1<<90
    Out[3]: 1237940039285380274899124224

Also, multiply.reduce returns int64 instead of a Python int:

    In [4]: type( multiply.reduce([1,2,3]) )
    Out[4]: numpy.int64

which also leads to overflow-related problems if we further compute with this value and other integers and the result exceeds int64: it becomes a float:

    In [5]: idx0_stop = 18446744073709551615

    In [6]: stride0 = numpy.int64(1)

    In [7]: byte0_stop = idx0_stop * stride0

    In [8]: byte0_stop
    Out[8]: 1.8446744073709552e+19

and then it becomes a real problem for BigArray.__getitem__()

    wendelin.core/bigarray/__init__.py:326: RuntimeWarning: overflow encountered in long_scalars
      page0_min = min(byte0_start, byte0_stop+byte0_stride) // pagesize # TODO -> fileh.pagesize

and then

    >       vma0 = self._fileh.mmap(page0_min, page0_max-page0_min+1)
    E       TypeError: integer argument expected, got float

~~~~

So just avoid multiply.reduce() and do our own mul() properly, the same way sum() is built into Python, and we avoid overflow-related problems.
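A minimal sketch of such a mul(), assuming its inputs are plain Python integers (which are arbitrary-precision and therefore cannot overflow). The signature, including the `start` parameter modelled on sum(), is an assumption of this sketch and may differ from the one in the patch.

```python
from functools import reduce
import operator


def mul(iterable, start=1):
    """Multiply all items together, like sum() adds them.

    With Python ints the result is an arbitrary-precision int,
    so there is no silent wrap-around to 0 or promotion to float.
    """
    return reduce(operator.mul, iterable, start)


if __name__ == "__main__":
    print(mul((1 << 30, 1 << 30, 1 << 30)) == 1 << 90)
```

Note that this helps only as long as the factors themselves are Python ints; multiplying fixed-width numpy scalars would still be subject to their overflow rules.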
-