fixup! fixup! ZBigFile: Add ZBlk format option 'h' (heuristic) (4)
Take suggestions from Levin into account (nexedi/wendelin.core!20 (comment 198330)):

    1. appending can be False, even though we are appending (misleading name).
    2. A big append uses ZBlk0 due to an if clause 25 lines later (the logic is a bit far away).
    3. In the previous version it could happen that if a block was filled up with small appends (ZBlk1), it wasn't transformed to ZBlk0 in case the next block would be filled up with only one big append.
    4. Regarding the actual algorithm, I wonder, why do we only use ZBlk0 for big appends in case it's the first append of a new ZBlk? Couldn't we generally say it's ok to use ZBlk0 in case of big appends?

All these notes are valid. The problem comes from the misleading semantics attached to the 'appending' name. The name indicates only appending, but sometimes we also wanted it to mean 'small', and we were not doing that consistently.

-> Fix the problem by splitting 'appending' and 'small' into separate flags so that there is no room for confusion.

-> Rework the flow of the code so that all cases related to appending are under one branch.

-> Also optimize the ndelta computation: when done in plain Python, just this part was taking a lot of time, as timing for the initial writeup showed:

    writeup with ZBlk0:  ~20-25s
    writeup with ZBlk1:  ~20-30s
    writeup with auto:   was ~120s

   Now, after switching to numpy for the ndelta computation, the whole runtime with 'auto' is ~35s.

The whole runtime, if I observe the benchmark execution correctly, is dominated by database writeup.
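
For illustration only, a minimal sketch of the two ideas above: separate 'appending' and 'small' flags, and an ndelta computation done with numpy instead of a plain Python loop. The names choose_zblk, ndelta_python and ndelta_numpy are hypothetical and are not taken from the wendelin.core sources. The sketch also assumes that ndelta means "number of bytes that differ between the old and the new block data", and the choice made for non-append changes is a placeholder.

```python
import numpy as np


def ndelta_python(old: bytes, new: bytes) -> int:
    """Count differing bytes with a plain Python loop (slow reference version)."""
    n = min(len(old), len(new))
    changed = sum(1 for i in range(n) if old[i] != new[i])
    # Bytes present in only one of the buffers are counted as changed too.
    return changed + abs(len(old) - len(new))


def ndelta_numpy(old: bytes, new: bytes) -> int:
    """Same count as ndelta_python, but the comparison is vectorized by numpy."""
    n = min(len(old), len(new))
    a = np.frombuffer(old, dtype=np.uint8)[:n]
    b = np.frombuffer(new, dtype=np.uint8)[:n]
    return int(np.count_nonzero(a != b)) + abs(len(old) - len(new))


def choose_zblk(appending: bool, small: bool) -> str:
    """Pick a ZBlk format name from two independent flags (hypothetical helper).

    All append-related cases live under one branch:
    small appends -> ZBlk1, big appends -> ZBlk0.
    """
    if appending:
        return "ZBlk1" if small else "ZBlk0"
    # Assumption made for this sketch only: non-append changes use ZBlk0.
    return "ZBlk0"
```

With the flags kept separate, a big append is handled right next to a small one instead of by a distant if clause, and neither flag has to carry the meaning of the other.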
lib/tests/test_mem.py
0 → 100644