- 06 Jul, 2016 1 commit
-
-
Kirill Smelkov authored
@kazuhiko reports that wendelin.core build is currently broken on Python 3.5. Indeed it was:

    In file included from bigfile/_bigfile.c:37:0:
    ./include/wendelin/compat_py2.h: In function ‘_PyThreadState_UncheckedGetx’:
    ./include/wendelin/compat_py2.h:66:28: warning: implicit declaration of function ‘_Py_atomic_load_relaxed’ [-Wimplicit-function-declaration]
         return (PyThreadState*)_Py_atomic_load_relaxed(&_PyThreadState_Current);
                                ^
    ./include/wendelin/compat_py2.h:66:53: error: ‘_PyThreadState_Current’ undeclared (first use in this function)
         return (PyThreadState*)_Py_atomic_load_relaxed(&_PyThreadState_Current);
                                                         ^
    ./include/wendelin/compat_py2.h:66:53: note: each undeclared identifier is reported only once for each function it appears in
    ./include/wendelin/compat_py2.h:67:1: warning: control reaches end of non-void function [-Wreturn-type]
     }
     ^

The story here is that in 3.5 they decided to remove direct access to _PyThreadState_Current and atomic implementations - because that might semantically conflict with other headers implementing atomics - and provide only access by function.

Starting from Python 3.5.2rc1 the function to get current thread state without asserting it is !NULL - _PyThreadState_UncheckedGet() - was added:

    https://github.com/python/cpython/commit/df858591

so for those python versions we can directly use it.

After the fix wendelin.core tox tests pass under all python2.7, python3.4 and python3.5.

More context here:

    https://bugs.python.org/issue26154
    https://bugs.python.org/issue25150

Fixes: #1
-
- 01 Jul, 2016 1 commit
-
-
Kirill Smelkov authored
Now that ZODB 5.0 eggs are starting to appear (see e.g. [1] for context) let's limit ZODB4 test setup to actually install ZODB4, not 5. [1] https://groups.google.com/forum/#!topic/zodb/P05S0pyUbAM
-
- 13 Jun, 2016 2 commits
-
-
Kirill Smelkov authored
>= 1.6 was already using latest in 1.6 series, but >= 1.6.2 is more explicit. Also: in 1.6.2 NEO switched from MySQL-python to mysqlclient: neoppod@5f0c93f5 so we switch it too.
-
Kirill Smelkov authored
Namely 1.8.x, 1.9.x -> 1.10.x -> 1.11.x
-
- 15 Dec, 2015 1 commit
-
-
Kirill Smelkov authored
-
- 24 Sep, 2015 2 commits
-
-
Kirill Smelkov authored
Our current approach is that each file block is represented by 1 zodb object, with block size being 2M. Even with trailing \0 trimming, which halves the overhead on average, DB size grows very fast if we do a lot of small appends or changes.

So another format needs to be introduced which has lower overhead for storing small changes:

In general, to represent BigFile as ZODB objects, each file block could be represented separately either as

    1) one ZODB object, or          (ZBlk0 - this is what we have already)
    2) group of ZODB objects        (ZBlk1 - this is what we introduce)
       with top-level BTree directory #blk -> objects representing block.

For "1" we have

    - low-overhead access time (only 1 object loaded from DB), but
    - high-overhead in terms of ZODB size (with FileStorage / ZEO, every change to a block causes it to be written into DB in full again)

For "2" we have

    - low-overhead in terms of ZODB size (only part of a block is overwritten in DB on single change), but
    - high-overhead in terms of access time (several objects need to be loaded for 1 block)

In general it is not possible to have low-overhead for both i) access-time, and ii) DB size, with approach where we do block objects representation / management on *client* side.

On the other hand, if object management is moved to DB *server* side, it is possible to deduplicate them there and this way have low-overhead for both access-time and DB size with just client storing 1 object per file block. This will be our future approach after we teach NEO about object deduplication.

~~~~

As shown above in the last paragraph it is not possible to perform optimally on client side. Thus ZBlk1 should be only an intermediate solution until we move data management to DB server side, with the main criterion for ZBlk1 being to keep it simple.

In this patch a simple scheme is used, where every block is divided into chunks organized via BTree. When a block part changes, only the corresponding chunk is updated. Chunk size is chosen to be 4K which creates ~ 512 fanout for 2M block.

DB size after tests is changed as follows:

            bigfile     bigarray
    ZBlk0     24K        6200K
    ZBlk1     36K          36K

( slight size increase for bigfile tests is because of btree structures overhead )

Time to run tests stays approximately the same.

/cc @Tyagov, @klaus
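The chunking scheme described above can be illustrated with a small sketch. This is not the wendelin.core implementation: the class name `ChunkedBlock` and its methods are hypothetical, and a plain dict stands in for the BTree. It only shows why a small change to a 2M block touches a single 4K chunk.

```python
CHUNK_SIZE = 4 * 1024          # 4K chunk, as stated in the message above
BLOCK_SIZE = 2 * 1024 * 1024   # 2M block, as stated in the message above


class ChunkedBlock:
    """Hypothetical stand-in for a ZBlk1-style block (a dict replaces the BTree)."""

    def __init__(self):
        # offset inside the block -> chunk bytes with trailing zeros trimmed
        self.chunks = {}

    def load(self):
        """Reassemble the full block; missing chunks read as zeros."""
        buf = bytearray(BLOCK_SIZE)
        for off, data in self.chunks.items():
            buf[off:off + len(data)] = data
        return bytes(buf)

    def store(self, blkdata):
        """Store new block content and return offsets of chunks that changed."""
        assert len(blkdata) == BLOCK_SIZE
        changed = []
        for off in range(0, BLOCK_SIZE, CHUNK_SIZE):
            data = bytes(blkdata[off:off + CHUNK_SIZE]).rstrip(b"\0")
            old = self.chunks.get(off, b"")
            if data == old:
                continue              # untouched chunk: nothing to rewrite
            if data:
                self.chunks[off] = data
            else:
                del self.chunks[off]  # chunk became all zeros: drop it
            changed.append(off)
        return changed
```

With one object per block, the second write in a sequence of small edits would rewrite all 2M; here only the chunk covering the edited bytes is reported as changed, at the cost of loading several chunk objects when the block is read.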
-
Kirill Smelkov authored
- current ZBlk becomes format 0
- write format can be selected via WENDELIN_CORE_ZBLK_FMT env var
- upon writing a block we always make sure we write it in current write format - so if a block was previously written in one format, it could be changed on the next write.
- tox is prepared to test all write formats (so far only ZBlk0 there).

The reason is - in the next patch we'll introduce another format for blocks which is optimized for small changes.
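A minimal sketch of the write-format selection described above. Only the WENDELIN_CORE_ZBLK_FMT variable name comes from the message; the accepted values ("ZBlk0", "ZBlk1"), the default, and the helper names `write_format` and `store_block` are assumptions made for illustration.

```python
import os

# Assumed spelling of the accepted values; the message only names the env var.
KNOWN_FORMATS = ("ZBlk0", "ZBlk1")


def write_format(environ=os.environ):
    """Return the block format to use for writes (default is an assumption)."""
    fmt = environ.get("WENDELIN_CORE_ZBLK_FMT", "ZBlk0")
    if fmt not in KNOWN_FORMATS:
        raise RuntimeError("unknown ZBlk format: %r" % (fmt,))
    return fmt


def store_block(blktab, blk, data, environ=os.environ):
    """Write block #blk, always in the current write format.

    blktab is a hypothetical mapping blk# -> (format, data). A block that was
    stored earlier in another format is simply replaced, so its format can
    change on the next write.
    """
    blktab[blk] = (write_format(environ), data)
```

Passing the environment as a parameter keeps the sketch easy to test without touching the real process environment.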
-
- 06 Aug, 2015 1 commit
-
- 26 Jun, 2015 1 commit
-
-
Kirill Smelkov authored
/cc @jm
-
- 28 May, 2015 1 commit
-
-
Kirill Smelkov authored
It was hanging with NumPy-1.9 before 425dc5d1 (bigarray: Raise IndexError for out-of-bound element access), because of the following correct NumPy commit:

    https://github.com/numpy/numpy/commit/d36f8227

and in particular

    https://github.com/numpy/numpy/commit/d36f8227#diff-6d326badc0872de91e025cbfb0be1aafR522

That PySequence_Fast(obj) (with obj being BigArray) creates iterator on top of obj and before our previous IndexError fix in 425dc5d1, this was looping forever.

Test explicitly with both NumPy 1.8 and NumPy 1.9, that this construct does not hang.

/cc @Tyagov
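The hang described above follows from Python's legacy sequence iteration protocol: for an object that defines `__getitem__` but not `__iter__`, iteration calls `obj[0]`, `obj[1]`, ... and stops only when `IndexError` is raised. A minimal sketch with a hypothetical `TinyArray` class (not BigArray itself):

```python
class TinyArray:
    """Hypothetical array-like object with __getitem__ and no __iter__."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, idx):
        if idx < 0:
            idx += self._n
        # Without this check an iterator built on top of the object would
        # keep asking for idx = n, n+1, ... and never terminate.
        if not (0 <= idx < self._n):
            raise IndexError("index out of range")
        return idx * 10


# Iteration terminates because out-of-bound access raises IndexError.
items = list(TinyArray(3))
```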
-
- 03 Apr, 2015 1 commit
-
-
Kirill Smelkov authored
-