- 28 Oct, 2021 14 commits
-
Kirill Smelkov authored
ΔBtail provides BTree-level history tail that WCFS - via ΔFtail - will use to compute which blocks of a ZBigFile need to be invalidated in OS file cache given raw ZODB changes on ZODB invalidation message. It also will be used by WCFS to implement isolation protocol, where on every FUSE READ request WCFS will query ΔBtail - again via ΔFtail - to find out revision of corresponding file block. Quoting ΔBtail documentation: ---- 8< ---- ΔBtail provides BTree-level history tail. It translates ZODB object-level changes to information about which keys of which BTree were modified, and provides service to query that information. ΔBtail class documentation ~~~~~~~~~~~~~~~~~~~~~~~~~~ ΔBtail represents tail of revisional changes to BTrees. It semantically consists of []δB ; rev ∈ (tail, head] where δB represents a change in BTrees space δB: .rev↑ {} root -> {}(key, δvalue) It covers only changes to keys from tracked subset of BTrees parts. In particular a key that was not explicitly requested to be tracked, even if it was changed in δZ, is not guaranteed to be present in δB. ΔBtail provides the following operations: .Track(path) - start tracking tree nodes and keys; root=path[0], keys=path[-1].(lo,hi] .Update(δZ) -> δB - update BTree δ tail given raw ZODB changes .ForgetPast(revCut) - forget changes ≤ revCut .SliceByRev(lo, hi) -> []δB - query for all trees changes with rev ∈ (lo, hi] .SliceByRootRev(root, lo, hi) -> []δT - query for changes of a tree with rev ∈ (lo, hi] .GetAt(root, key, at) -> (value, rev) - get root[key] @at assuming root[key] ∈ tracked where δT represents a change to one tree δT: .rev↑ {}(key, δvalue) An example for tracked set is a set of visited BTree paths. There is no requirement that tracked set belongs to only one single BTree. See also zodb.ΔTail and zdata.ΔFtail Concurrency ΔBtail is safe to use in single-writer / multiple-readers mode. That is at any time there should be either only sole writer, or, potentially several simultaneous readers. 
The table below classifies operations: Writers: Update, ForgetPast Readers: Track + all queries (SliceByRev, SliceByRootRev, GetAt) Note that, in particular, it is correct to run multiple Track and queries requests simultaneously. ΔBtail organization ~~~~~~~~~~~~~~~~~~~ ΔBtail keeps raw ZODB history in ΔZtail and uses BTree-diff algorithm(*) to turn δZ into BTree-level diff. For each tracked BTree a separate ΔTtail is maintained with tree-level history in ΔTtail.vδT . Because it is very computationally expensive(+) to find out for an object to which BTree it belongs, ΔBtail cannot provide full BTree-level history given just ΔZtail with δZ changes. Due to this ΔBtail requires help from users, which are expected to call ΔBtail.Track(treepath) to let ΔBtail know that such and such ZODB objects constitute a path from root of a tree to some of its leaf. After Track call the objects from the path and tree keys, that are covered by leaf node, become tracked: from now-on ΔBtail will detect and provide BTree-level changes caused by any change of tracked tree objects or tracked keys. This guarantee can be provided because ΔBtail now knows that such and such objects belong to a particular tree. To manage knowledge which tree part is tracked ΔBtail uses PPTreeSubSet. This data-structure represents so-called PP-connected set of tree nodes: simply speaking it builds on some leafs and then includes parent(leaf), parent(parent(leaf)), etc. In other words it's a "parent"-closure of the leafs. The property of being PP-connected means that starting from any node from such set, it is always possible to reach root node by traversing .parent links, and that every intermediate node went-through during traversal also belongs to the set. A new Track request potentially grows tracked keys coverage. Due to this, on a query, ΔBtail needs to recompute potentially whole vδT of the affected tree. 
This recomputation is managed by "vδTSnapForTracked*" and "_rebuild" functions and uses the same treediff algorithm, that Update is using, but modulo PPTreeSubSet corresponding to δ key coverage. Update also potentially needs to rebuild whole vδT history, not only append new δT, because a change to tracked tree nodes can result in growth of tracked key coverage. Queries are relatively straightforward code that work on vδT snapshot. The main complexity, besides BTree-diff algorithm, lies in recomputing vδT when set of tracked keys changes, and in handling that recomputation in such a way that multiple Track and queries requests could be all served in parallel. Concurrency In order to allow multiple Track and queries requests to be served in parallel ΔBtail employs special organization of vδT rebuild process where complexity of concurrency is reduced to math on merging updates to vδT and trackSet, and on key range lookup: 1. vδT is managed under read-copy-update (RCU) discipline: before making any vδT change the mutator atomically clones whole vδT and applies its change to the clone. This way a query, once it retrieves vδT snapshot, does not need to further synchronize with vδT mutators, and can rely on that retrieved vδT snapshot will remain immutable. 2. a Track request goes through 3 states: "new", "handle-in-progress" and "handled". At each state keys/nodes of the Track are maintained in: - ΔTtail.ktrackNew and .trackNew for "new", - ΔTtail.krebuildJobs for "handle-in-progress", and - ΔBtail.trackSet for "handled". trackSet keeps nodes, and implicitly keys, from all handled Track requests. For all keys, covered by trackSet, vδT is fully computed. a new Track(keycov, path) is remembered in ktrackNew and trackNew to be further processed when a query should need keys from keycov. vδT is not yet providing data for keycov keys. when a Track request starts to be processed, its keys and nodes are moved from ktrackNew/trackNew into krebuildJobs. 
vδT is not yet providing data for requested-to-be-tracked keys. all trackSet, trackNew/ktrackNew and krebuildJobs are completely disjoint: trackSet ^ trackNew = ø trackSet ^ krebuildJobs = ø trackNew ^ krebuildJobs = ø 3. when a query is served, it needs to retrieve vδT snapshot that takes related previous Track requests into account. Retrieving such snapshots is implemented in vδTSnapForTracked*() family of functions: there it checks ktrackNew/trackNew, and if those sets overlap with query's keys of interest, run vδT rebuild for keys queued in ktrackNew. the main part of that rebuild can be run without any locks, because it does not use nor modify any ΔBtail data, and for δ(vδT) it just computes a fresh full vδT build modulo retrieved ktrackNew. Only after that computation is complete, ΔBtail is locked again to quickly merge in δ(vδT) update back into vδT. This organization is based on the fact that vδT/(T₁∪T₂) = vδT/T₁ | vδT/T₂ ( i.e. vδT computed for tracked set being union of T₁ and T₂ is the same as merge of vδT computed for tracked set T₁ and vδT computed for tracked set T₂ ) and that trackSet | (δPP₁|δPP₂) = (trackSet|δPP₁) | (trackSet|δPP₂) ( i.e. tracking set updated for union of δPP₁ and δPP₂ is the same as union of tracking set updated with δPP₁ and tracking set updated with δPP₂ ) these merge properties allow to run computation for δ(vδT) and δ(trackSet) independently and with ΔBtail unlocked, which in turn enables running several Track/queries in parallel. 4. while vδT rebuild is being run, krebuildJobs keeps corresponding keycov entry to indicate in-progress rebuild. Should a query need vδT for keys from that job, it first waits for corresponding job(s) to complete. Explained rebuild organization allows non-overlapping queries/track-requests to run simultaneously. (This property is essential to WCFS because otherwise WCFS would not be able to serve several non-overlapping READ requests to one file in parallel.) 
-------- (*) implemented in treediff.go (+) full database scan ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Some preliminary history: 877e64a9 X wcfs: Fix tests to pass again c32055fc X wcfs/xbtree: ΔBtail tests += ø -> Tree; Tree -> ø 78f2f88b X wcfs/xbtree: Fix treediff(a, ø) 5324547c X wcfs/xbtree: root(a) must stay in trackSet even after treediff(a,ø) f65f775b X wcfs/xbtree: treediff(ø, b) c75b1c6f X wcfs/xbtree: Start killing holeIdx 0fa06cbd X kadj must be taken into account as kadj^δZ ef5e5183 X treediff ret += δtkeycov f30826a6 X another bug in δtkeyconv computation 0917380e X wcfs: assert that keycov only grow 502e05c2 X found why TestΔBTailAllStructs was not effective to find δtkeycov bugs 450ba707 X Fix rebuild with ø @at2 f60528c9 X ΔBtail.Clone had bug that it was aliasing klon and orig data 9d20f8e8 X treediff: Fix BUG while computing AB coverage ddb28043 X rebuild: Don't return nil for empty ΔPPTreeSubSet - that leads to SIGSEGV 324241eb X rebuild: tests: Don't reflect.DeepEqual in inner loop 8f6e2b1e X rebuild: tests: Don't access ZODB in XGetδKV 2c0b4793 X rebuild: tests: Don't access ZODB in xtrackKeys 8f0e37f2 X rebuild: tests: Precompute kadj10·kadj21 271d953d X rebuild: tests: Move ΔBtail.Clone test out of hot inner loop into separate test a87cc6de X rebuild: tests: Don't recompute trackSet(keys1R2) several times 01433e96 X rebuild: tests: Don't compute keyCover in trackSet 7371f9c5 X rebuild: tests: Inline _assertTrack 3e9164b3 X rebuild: tests: Don't exercise keys from keys2 that already became tracked after Track(keys1) + Update e9c4b619 X rebuild: tests: Random testing d0fe680a X δbtail += ForgetPast 210e9b07 X Fix ΔBtail.SliceByRootRev (lo,hi] handling 855ab4b8 X ΔBtail: Goodbye .KVAtTail 2f5582e6 X ΔBtail: Tweak tests to run faster in normal mode cf352737 X random testing found another failing test for rebuild... 
7f7e34e0 X wcfs/xbtree: Fix update not to add duplicate extra point if rebuild - called by Update - already added it 6ad0052c X ΔBtail.Track: No need to return error aafcacdf X xbtree: GetAt test 784a6761 X xbtree: Fix KAdj definition after treediff was reworked this summer to base decisions on node keycoverage instead of particular node keys 0bb1c22e X xbtree: Verify that ForgetPast clones vδT on trim a8945cbf X Start reworking rebuild routines not to modify data inplace b74dda09 X Start switching Track from Track(key) to Track(keycov) dea85e87 X Switch GetAt to vδTSnapForTrackedKey aa0288ce X Switch SliceByRootRev to vδTSnapForTracked c4366b14 X xbtree: tests: Also verify state of ΔTtail.ktrackNew b98706ad X Track should be nop if keycov/path is already in krebuildJobs e141848a X test.go ↑ timeout 10m -> 20m 423f77be X wcfs: Goodby holeIdx 37c2e806 X wcfs: Teach treediff to compute not only δtrack (set of nodes), but also δ for track-key coverage 52c72dbb X ΔBtail.rebuild started to work draftly c9f13fc7 X Get rebuild tests to run in a sane time; Add proper random-based testing for rebuild c7f1e3c9 X xbtree: Factor testing infrastructure bits into xbtree/xbtreetest 7602c1f4 ΔBtail concurrency
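The "rev ∈ (tail, head]" semantics and the read-copy-update discipline described in the ΔBtail documentation can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyDeltaTail` and its method names are hypothetical and are not the Go API of ΔBtail. It keeps one flat vector of (rev, δ) entries instead of per-tree vδT, and does no BTree diffing at all.

```python
import threading


class ToyDeltaTail:
    """Toy history tail: entries (rev, delta) with rev in (tail, head].

    Hypothetical illustration only, not the ΔBtail implementation.
    """

    def __init__(self, at0):
        self.tail = at0
        self.head = at0
        self._vdelta = ()             # immutable tuple: replaced, never mutated
        self._mu = threading.Lock()   # serializes writers only

    def snapshot(self):
        # A reader takes the current tuple once and needs no further
        # synchronization: writers never modify an already published tuple.
        return self._vdelta

    def append(self, rev, delta):
        with self._mu:
            assert rev > self.head, "revisions must increase"
            # RCU style: build a new vector and publish it atomically.
            self._vdelta = self._vdelta + ((rev, delta),)
            self.head = rev

    def forget_past(self, rev_cut):
        # Forget changes with rev <= rev_cut, again by publishing a new vector.
        with self._mu:
            self._vdelta = tuple(e for e in self._vdelta if e[0] > rev_cut)
            self.tail = max(self.tail, rev_cut)

    def slice_by_rev(self, lo, hi):
        # Query for changes with rev in (lo, hi].
        return [e for e in self.snapshot() if lo < e[0] <= hi]
```

A snapshot taken before `forget_past` keeps all of its entries, which is the property that lets queries run without further locking once they hold a snapshot.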
-
Kirill Smelkov authored
This algorithm will be internally used by ΔBtail in the next patch.

The algorithm would be simple if we needed to diff two trees completely. However in ΔBtail only a subpart of BTree nodes is tracked(*), and the diff has to work modulo that tracking set.

No tests now because ΔBtail tests will cover treediff functionality as well.

Some preliminary history:

kirr/wendelin.core@78f2f88b X wcfs/xbtree: Fix treediff(a, ø)
kirr/wendelin.core@5324547c X wcfs/xbtree: root(a) must stay in trackSet even after treediff(a,ø)
kirr/wendelin.core@f65f775b X wcfs/xbtree: treediff(ø, b)
kirr/wendelin.core@c75b1c6f X wcfs/xbtree: Start killing holeIdx
kirr/wendelin.core@ef5e5183 X treediff ret += δtkeycov
kirr/wendelin.core@9d20f8e8 X treediff: Fix BUG while computing AB coverage
kirr/wendelin.core@ddb28043 X rebuild: Don't return nil for empty ΔPPTreeSubSet - that leads to SIGSEGV
kirr/wendelin.core@f68398c9 X wcfs: Move treediff into its own file

(*) because full BTree scan is needed to discover all of its nodes.

Quoting treediff documentation:

---- 8< ----
treediff provides diff for BTrees

Use δZConnectTracked + treediff to compute BTree-diff caused by δZ:

    δZConnectTracked(δZ, trackSet) -> δZTC, δtopsByRoot
    treediff(root, δtops, δZTC, trackSet, zconn{Old,New}) -> δT, δtrack, δtkeycov

δZConnectTracked computes BTree-connected closure of δZ modulo tracked set and also returns δtopsByRoot to indicate which tree objects were changed and in which subtree parts. With that information one can call treediff for each changed root to compute BTree-diff and δ for trackSet itself.

BTree diff algorithm

diffT, diffB and δMerge constitute the diff algorithm implementation. diff(A,B) works on pair of A and B whole key ranges split into regions covered by tree nodes. The splitting represents current state of recursion into corresponding tree. If a node in particular key range is Bucket, that bucket contributes to δ- in case of A, and to δ+ in case of B. If a node in particular key range is Tree, the algorithm may want to expand that tree node into its children and to recurse into some of the children.

There are two phases:

- Phase 1 expands A top->down driven by δZTC, adds reached buckets to δ-, and queues key regions of those buckets to be processed on B.

- Phase 2 starts processing from queued key regions, expands them on B and adds reached buckets to δ+. Then it iterates to reach consistency in between A and B because processing buckets on B side may increase δ key coverage, and so corresponding key ranges have to be processed again on A. Which in turn may increase δ key coverage again, and needs to be processed on B side, etc...

The final δ is merge of δ- and δ+.

diffT has more detailed explanation of phase 1 and phase 2 logic.
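The last step, merging δ- and δ+ into the final δ, can be pictured at the plain key->value level. The sketch below is an assumption-laden illustration, not the δMerge code from treediff.go: `merge_delta` and the `DEL` sentinel are hypothetical names, and the inputs are taken to be ordinary dicts holding the entries of buckets reached on the A side (δ-) and on the B side (δ+).

```python
DEL = object()  # hypothetical sentinel meaning "key is absent on that side"


def merge_delta(d_minus, d_plus):
    """Merge entries removed on A side (d_minus) with entries added on B side (d_plus).

    Returns {key: (old, new)} only for keys whose value actually differs,
    so a key that merely moved between buckets drops out of the result.
    """
    delta = {}
    for k in d_minus.keys() | d_plus.keys():
        old = d_minus.get(k, DEL)
        new = d_plus.get(k, DEL)
        if old != new:
            delta[k] = (old, new)
    return delta
```

For example, if key 2 is present with the same value on both sides because it only migrated to another bucket, it does not appear in the merged δ, while a key present on one side only is reported as deleted or added.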
-
Kirill Smelkov authored
These data structures will be used in ΔBtail to maintain the set of tracked BTree nodes, and to represent δ to such set.

Some preliminary history:

kirr/wendelin.core@78f2f88b X wcfs/xbtree: Fix treediff(a, ø)
kirr/wendelin.core@5324547c X wcfs/xbtree: root(a) must stay in trackSet even after treediff(a,ø)
kirr/wendelin.core@f65f775b X wcfs/xbtree: treediff(ø, b)
kirr/wendelin.core@66bc41ce X Fix bug in PPTreeSubSet.Difference - it was always leaving root node alive
kirr/wendelin.core@ddb28043 X rebuild: Don't return nil for empty ΔPPTreeSubSet - that leads to SIGSEGV
kirr/wendelin.core@a87cc6de X rebuild: tests: Don't recompute trackSet(keys1R2) several times

Quoting PPTreeSubSet and ΔPPTreeSubSet documentation:

---- 8< ----
PPTreeSubSet represents PP-connected subset of tree node objects.

It is PP(xleafs) where PP(node) maps node to {node, node.parent, node.parent.parent, ...} up to top root from where the node is reached.

The nodes in the set are represented by their Oid.

Usually PPTreeSubSet is built as PP(some-leafs), but in general the starting nodes are arbitrary. PPTreeSubSet can also have many root nodes, thus not necessarily representing a subset of a single tree.

Usual set operations are provided: Union, Difference and Intersection.

Nodes can be added into the set via AddPath. Path is reverse operation - it returns path to tree node given its oid.

Every node in the set comes with .parent pointer.

~~~~

ΔPPTreeSubSet represents a change to PPTreeSubSet.

It can be applied via PPTreeSubSet.ApplyΔ .

The result B of applying δ to A is:

    B = A.xDifference(δ.Del).xUnion(δ.Add)    (*)

(*) NOTE δ.Del and δ.Add might have their leafs starting from non-leaf nodes in A/B. This situation arises when δ represents a change in path to particular node, but that node itself does not change, for example:

         c*             c
        / \            /
      41*  42         41
       |    |         | \
      22   43        46  43
            |         |   |
           44        22  44

Here nodes {c, 41} are changed, node 42 is unlinked, and node 46 is added. Nodes 43 and 44 stay unchanged.

    δ.Del = c-42-43   | c-41-22
    δ.Add = c-41-43   | c-41-46-22

The second component with "-22" builds from leaf, but the first component with "-43" builds from non-leaf node.

    ΔnchildNonLeafs = {43: +1}

Only complete result of applying all

  - xfixup(-1, ΔnchildNonLeafs)
  - δ.Del,
  - δ.Add, and
  - xfixup(+1, ΔnchildNonLeafs)

produces correctly PP-connected set.
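The PP(xleafs) closure and the PP-connectedness property can be sketched over a plain parent map. This is an illustration under assumptions, not the PPTreeSubSet implementation: `pp_closure`, `is_pp_connected` and the `parent_a` map are hypothetical, and nodes are identified by small labels instead of Oids. `parent_a` encodes the left-hand tree from the example in the PPTreeSubSet documentation.

```python
def pp_closure(parent, nodes):
    """Return PP(nodes): the nodes plus all their ancestors via parent links.

    parent maps node -> its parent; a root maps to None.
    """
    out = set()
    for n in nodes:
        # Walk up until the root, or until we join an already visited path.
        while n is not None and n not in out:
            out.add(n)
            n = parent[n]
    return out


def is_pp_connected(parent, s):
    """Check that every node in s is a root or has its parent in s too."""
    return all(parent[n] is None or parent[n] in s for n in s)


# Left-hand tree of the example: c -> {41, 42}, 41 -> 22, 42 -> 43, 43 -> 44.
parent_a = {'c': None, 41: 'c', 42: 'c', 22: 41, 43: 42, 44: 43}
```

With this map, the closure built from leaf 22 is {22, 41, c}, while a set such as {44, c} is not PP-connected because the intermediate nodes 43 and 42 are missing.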
-
Kirill Smelkov authored
RangedMap is Key->VALUE map with adjacent keys mapped to the same value coalesced into Ranges.

RangedKeySet is set of Keys with adjacent keys coalesced into Ranges.

These data structures will be needed for ΔBtail.

For now the implementation is simple: it keeps the whole map in a linear slice, because both RangedMap and RangedKeySet will be used in ΔBtail to keep something proportional to δ of a change, which is assumed to be small or medium most of the time.

Some preliminary history:

kirr/wendelin.core@6ea5920a X xbtree: Less copy/garbage in RangedKeySet ops
kirr/wendelin.core@3ecacd99 X need to keep Value first so that sizeof(set-entry) = sizeof(KeyRange)
kirr/wendelin.core@a5b9b19b X SetRange draftly works
kirr/wendelin.core@ed2de0de X Tests for Get
kirr/wendelin.core@3b7b69e6 X fixes for empty set/range
kirr/wendelin.core@6972f999 X xbtree/blib: RangedMap, RangedSet += IntersectsRange, Intersection
kirr/wendelin.core@57be0126 X RangedMap - like RangedSet but for dict
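The coalescing of adjacent keys into ranges, kept in a linear list, can be sketched as follows. This is a hypothetical toy, not the Go RangedKeySet: the class and method names are made up for illustration, and the half-open [lo, hi) representation is an assumption of this sketch.

```python
class ToyRangedKeySet:
    """Set of integer keys stored as sorted, disjoint, non-adjacent [lo, hi) ranges."""

    def __init__(self):
        self.ranges = []   # list of (lo, hi) pairs, kept sorted

    def add_range(self, lo, hi):
        """Add [lo, hi) and merge it with every range it overlaps or touches."""
        if lo >= hi:
            return
        kept = []
        for (a, b) in self.ranges:
            if b < lo or hi < a:
                kept.append((a, b))              # strictly apart: keep as is
            else:
                lo, hi = min(lo, a), max(hi, b)  # overlapping or adjacent: absorb
        kept.append((lo, hi))
        kept.sort()
        self.ranges = kept

    def add(self, key):
        self.add_range(key, key + 1)

    def has(self, key):
        # Linear scan, in the spirit of "whole map in a linear slice".
        return any(a <= key < b for (a, b) in self.ranges)
```

Adding keys 1 and 3 yields two separate ranges; adding 2 afterwards bridges them into a single range covering 1..3. The linear scan is adequate only under the same assumption the commit message makes, namely that the set stays small.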
-
Kirill Smelkov authored
Add treeenv.go that combines Treegen and client side access to ZODB with committed trees as extension to testing.T . The environment allows to easily see which tree update was committed, what is the difference in terms of KV, what is the state of updated tree and state of pointed-to ZBlk objects. This will be used to test upcoming ΔBtail and ΔFtail. Main functionality is in treeenv.go; the other added files are to support that. Some preliminary history: kirr/wendelin.core@f07502fc X xbtreetest: Teach T & Commit to automatically provide At in symbolic form kirr/wendelin.core@0d62b05e X Adjust to btree.VGet & friends signature change to include keycov in visit callback kirr/wendelin.core@588a512a X zdata: Switch SliceByFileRev not to clone Zinblk kirr/wendelin.core@e9c4b619 X rebuild: tests: Random testing kirr/wendelin.core@43090ac7 X tests: Factor-out tree-test-env into tTreeEnv kirr/wendelin.core@d4a523b2 X δbtail: tests: Run much faster with live ZODB cache kirr/wendelin.core@271d953d X rebuild: tests: Move ΔBtail.Clone test out of hot inner loop into separate test kirr/wendelin.core@c32055fc X wcfs/xbtree: ΔBtail tests += ø -> Tree; Tree -> ø kirr/wendelin.core@5324547c X wcfs/xbtree: root(a) must stay in trackSet even after treediff(a,ø) kirr/wendelin.core@8f6e2b1e X rebuild: tests: Don't access ZODB in XGetδKV
-
Kirill Smelkov authored
Lacking generics we have set.go.in and instantiations for Set[int64], Set[string], Set[Oid] and Set[Tid] that will be used in follow-up patches. The set.go.in itself is mostly a generalized copy from git-backup: https://lab.nexedi.com/kirr/git-backup/blob/c9db60e8/set.go
-
Kirill Smelkov authored
treegen.go and treegen.py together provide a way

- to commit a particular BTree topology into ZODB, and
- to generate set of random tree topologies that all correspond to particular {k->v} dict.

This will be used in upcoming ΔBtail and ΔFtail tests.

See treegen.py documentation for details.

Some preliminary history:

kirr/wendelin.core@9eca74ec X Teach AllStructs to emit topologies with values
kirr/wendelin.core@1b962f03 X Restructure: found bug that it was not marking objects as modified
kirr/wendelin.core@2139af2c X treegen: Verify that tree actually saved to storage is what was requested
kirr/wendelin.core@b5e39d4a X wcfs/treegen: allstructs: Do not keep all tree structures in memory
kirr/wendelin.core@e9c4b619 X rebuild: tests: Random testing
kirr/wendelin.core@c32055fc X wcfs/xbtree: ΔBtail tests += ø -> Tree; Tree -> ø
kirr/wendelin.core@4300d88a X wcfs/xbtreetest/treegen.py: Fix it on ZODB4
-
Kirill Smelkov authored
This will be the place to keep BTree-related utilities. For now it provides only type aliases since Go lacks generics.
-
Kirill Smelkov authored
To handle invalidations, WCFS will need to detect changes to both ZBlk objects and to ZBigFile.blktab BTree that is mapping file blocks to ZBlk objects. And with BTree detecting changes is much more complex, because when a BTree changes, it might be rebalanced, or keys migrated from one tree/bucket node to another tree/bucket node. In other words a BTree change might be not only a change to a {}key->value dictionary, but also a change to BTree topology. Because there are many BTree topologies that correspond to the same {}key->value state, a change from kv₁ to kv₂, even if kv₁ and kv₂ are close to each other, might be accompanied by a dramatic change to topology of the tree. This creates a need for thoroughly testing the BTree difference algorithm because many of BTree topologies changes are tricky, and if a simple algorithm works on relatively stable topology updates, it does not necessarily mean that that same algorithm will continue to work correctly in the general case. So, as a preparatory step, here comes xbtree.py package, that can be used to inspect tree topologies, to create trees with specified topology and to manipulate topology of an existing tree. This package will be used in tests for upcoming ΔBtail. For debugging, and also since those tests will involve both Go and Python parts, it creates the need to be able to specify and exchange topology of a tree via compact string. This package also defines so called "topology encoding" to do so. 
Some preliminary history:

kirr/wendelin.core@fb56193f X fix metric to keep Z <- N order stable over key^
kirr/wendelin.core@809304d1 X "B:" indicates ø bucket with k&b, "B" - ø bucket with only keys
kirr/wendelin.core@9eca74ec X Teach AllStructs to emit topologies with values
kirr/wendelin.core@1b962f03 X Restructure: found bug that it was not marking objects as modified
kirr/wendelin.core@9181c5d9 X Restructure; verify that it marks as changed only modifed nodes
kirr/wendelin.core@e9902c4a X improve `xbtree topoview`

For reference, the xbtree.py package documentation is quoted below.

---- 8< ----
Package xbtree provides utilities for inspecting/manipulating internal structure of integer-keyed BTrees.

It will be primarily used to help verify ΔBTail in WCFS.

- `Tree` represents a tree node.
- `Bucket` represents a bucket node.
- `StructureOf` returns internal structure of ZODB BTree represented as Tree and Bucket nodes.
- `Restructure` reorganizes ZODB BTree instance according to specified topology structure.
- `AllStructs` generates all possible BTree topology structures with given keys.

Topology encoding
-----------------

Topology encoding provides way to represent structure of a Tree as path-like string.

TopoEncode converts Tree into its topology-encoded representation, while TopoDecode decodes topology-encoded string back into Tree.

The following example illustrates topology encoding represented by string "T3/T-T/B1-T5/B-B7,8,9":

      [ 3 ]             T3/         represents Tree([3])
       / \
     [ ] [ ]            T-T/        represents two empty Tree([])
      ↓   ↓
     |1|[ 5 ]           B1-T5/      represent Bucket([1]) and Tree([5])
         / \
        || |7|8|9|      B-B7,8,9    represents empty Bucket([]) and Bucket([7,8,9])

Topology encoding specification:

A Tree is encoded by level-order traversal, delimiting layers with "/". Inside a layer Tree and Bucket nodes are signalled as

    "T<keys>"           ; Tree
    "B<keys>"           ; Bucket with only keys
    "B<keys+values>"    ; Bucket with keys and values

Keys are represented as ","-delimited list of integers. For example Tree or Bucket with [1,3,5] keys are represented as

    "T1,3,5"        ; Tree([1,3,5])
    "B1,3,5"        ; Bucket([1,3,5])

Keys+values are represented as ","-delimited list of "<key>:<value>" pairs. For example Bucket corresponding to {1:1, 2:4, 3:9} is represented as

    "B1:1,2:4,3:9"  ; Bucket([1,2,3], [1,4,9])

Empty keys+values are represented as ":" - an empty Bucket for key->value mapping is represented as

    "B:"            ; Bucket([], [])

Nodes inside one layer are delimited with "-". For example a layer consisting of an empty Tree, a Tree with [1,3] keys, and Bucket with [4,5] keys is represented as

    "T-T1,3-B4,5"   ; layer with Tree([]), Tree([1,3]) and Bucket([4,5])

A layer consists of nodes that are followed by node-node links from upper layer in left-to-right order.

Visualization
-------------

The following visualization utilities are provided to help understand BTrees better:

- `topoview` displays BTree structure given its topology-encoded representation.
- `Tree.graphviz` returns Tree graph representation in dot language.
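The lexical part of the topology encoding quoted above can be sketched with a small tokenizer. This is not the TopoDecode function from xbtree.py: `topo_layers` is a hypothetical helper that only splits the string into layers of (kind, keys, values) tuples and does not link children to parents. It assumes non-negative integer keys (so that "-" is unambiguous as a node delimiter) and keeps values as strings.

```python
def topo_layers(text):
    """Split a topology-encoded string into layers of (kind, keys, values).

    kind is 'T' or 'B'; values is None for nodes encoded with keys only.
    """
    layers = []
    for layer in text.split('/'):          # "/" delimits layers
        nodes = []
        for node in layer.split('-'):      # "-" delimits nodes inside a layer
            kind, body = node[0], node[1:]
            assert kind in ('T', 'B'), node
            keys, values = [], None
            if body == ':':                # "B:" is an empty key->value bucket
                values = []
            elif body:
                for item in body.split(','):
                    if ':' in item:        # "<key>:<value>" pair
                        k, v = item.split(':')
                        keys.append(int(k))
                        values = (values or []) + [v]
                    else:                  # plain key
                        keys.append(int(item))
            nodes.append((kind, keys, values))
        layers.append(nodes)
    return layers
```

Applied to "T3/T-T/B1-T5/B-B7,8,9" this yields four layers matching the four rows of the example diagram; rebuilding the actual Tree from those layers would additionally need the rule that a layer's nodes follow the links of the upper layer in left-to-right order.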
-
Kirill Smelkov authored
For WCFS to be efficient it will have to carefully preserve OS cache on file invalidations. As a preparatory step, establish infrastructure for verifying the state of the OS file cache and start asserting on OS cache state in a couple of places. See comments added to tFile constructor that describe how OS cache state verification is set up.

Some preliminary history:

kirr/wendelin.core@8293025b X Thoughts on how to avoid readahead touching pages of neighbour block
kirr/wendelin.core@3054e4a3 X not touching neighbour block works via setting MADV_RANDOM in last 1/4 of every block
kirr/wendelin.core@18362227 X #5 access still triggers read to #4 ?
kirr/wendelin.core@17dbf94e X Provide mlock2 fallback for Ubuntu
kirr/wendelin.core@d134c0b9 X wcfs: test: try to live with only hard memlock limit adjusted
kirr/wendelin.core@c2423296 X Fix mlock2 build on Debian 8
-
Kirill Smelkov authored
Provide filesystem view of in-ZODB ZBigFiles, but do not implement support for invalidations nor isolation protocol yet. In particular, because ZODB invalidations are not yet handled, the filesystem does not update its data in accordance with ZODB updates, and instead provides stale data view that corresponds to the state of ZODB at the time when wcfs was mounted. The main parts of this patch are: - wcfs/wcfs.go is filesystem implementation itself together with overview. - wcfs/__init__.py is python wrapper to spawn and interoperate with that filesystem. - wcfs/wcfs_test.py is tests. Some preliminary history: kirr/wendelin.core@fe7efb94 X start of wcfs kirr/wendelin.core@878b2787 X draft loading kirr/wendelin.core@d58c71e8 X don't overalign end by 1 blksize if end is already aligned kirr/wendelin.core@29c9f13d X readBlk: Fix thinko in already case kirr/wendelin.core@59552328 X wcfs: Care to disable OS polling on us kirr/wendelin.core@c00d94c7 X workaround lack of exception chaining on Python2 with xdefer kirr/wendelin.core@0398e23d X bytearray turned out to be copying data kirr/wendelin.core@7a837040 X print wcfs.py py-level traceback on SIGBUS (e.g. wcfs.go aborting due to bug/panic) kirr/wendelin.core@661b871f X make sure tests don't get stuck even if wcfs gets killed -9 ... 
kirr/wendelin.core@2c043d29 X More effort to unmount failed wcfs.go kirr/wendelin.core@1ccc4478 X Use `with gil` + regular py code instead of PyGILState_Ensure/PyGILState_Release/PyRun_SimpleString kirr/wendelin.core@5dc9c791 X wcfs: Kill xdefer kirr/wendelin.core@91e9eba8 X wcfs: test: Register tFile to tDB early kirr/wendelin.core@a7138fef X wcfs: mkdir /tmp/wcfs with sticky bit kirr/wendelin.core@1eec76d0 X wcfs: try to set sticky for /tmp/wcfs even if the directory already exists kirr/wendelin.core@c2c35851 X wcfs: tests: Factor-out waiting for a general condition to become true into waitfor kirr/wendelin.core@78f36993 X wcfs: test: Fix thinko in getting /sys/fs/fuse/connection/<X> for wcfs kirr/wendelin.core@bc9eb16f X wcfs: tests: Don't use testmntpt everywhere kirr/wendelin.core@6dec74e7 X wcfs: tests: Split tDB into -> tDB + tWCFS kirr/wendelin.core@3a6bd764 X wcfs: tests: Run `fusermount -u` the second time if we had to kill wcfs kirr/wendelin.core@112720f3 X wcfs: tests: Print which files are still opened on wcfs if `fusermount -u` fails kirr/wendelin.core@bb40185b X wcfs: Take $WENDELIN_CORE_WCFS_OPTIONS into account not only from under join kirr/wendelin.core@03a9ef33 X wcfs: Remove credentials from zurl when computing wcfs mountpoint kirr/wendelin.core@68ee5bdc X wcfs: lsof tweaks kirr/wendelin.core@21671879 X wcfs: Teach entrypoint frontend to handle subcommands: serve, status, stop kirr/wendelin.core@b0642b80 X wcfs: Switch mountpoints from /tmp/wcfs/* to /dev/shm/* kirr/wendelin.core@b0ca031f X wcfs: Teach join/serve to start successfully even after unclean wcfs shutdown kirr/wendelin.core@5bfa8cf8 X wcfs: Add start to spawn a Server that can be later stopped (draft) kirr/wendelin.core@5fcec261 X wcfs: Run fusermount and friends with /bin:/usr/bin always on path kirr/wendelin.core@669d7a20 fixup! 
X wcfs: Run fusermount and friends with /bin:/usr/bin always on path kirr/wendelin.core@6b22f8c4 X wcfs: Teach start to start successfully even after unclean wcfs shutdown kirr/wendelin.core@15389db0 X wcfs: Tune _fuse_unmount to include `fusermount -u` error message into raised exception kirr/wendelin.core@153c002a X wcfs: _fuse_unmount: Try first `kill -TERM` before `kill -QUIT` wcfs kirr/wendelin.core@3244f3a6 X wcfs: lsof +D misbehaves - don't use it kirr/wendelin.core@a126e709 X wcfs: Put client log into its own logger kirr/wendelin.core@ac303d1e X wcfs: tests: -v -> show only wcfs.py logs verbosely kirr/wendelin.core@d671a9e9 X wcfs: Give more time to stop wcfs server
-
Kirill Smelkov authored
Add functionality to load objects from ZODB as saved by py wendelin.core. Mostly straightforward code. The main part is in zblk.go . Contrary to python implementation, go can load ZBlk1's subobjects in parallel, which, given scalable ZODB storage, can be significantly faster compared to serially loading all ZData subobjects as py code does. TODO test wrt data saved by Python3. Some preliminary history: 878b2787 X draft loading bf9a7405 X No longer rely on ZODB cache invariant for invalidations 0d62b05e X Adjust to btree.VGet & friends signature change to include keycov in visit callback b74dda09 X Start switching Track from Track(key) to Track(keycov)
-
Kirill Smelkov authored
Add initial stub for WCFS program and tests. WCFS functionality will be added step-by-step in follow-up commits. Some preliminary history: kirr/wendelin.core@0ae88a32 X .nxdtest: Verify Go bits with GOMAXPROCS=1,2,`nproc` kirr/wendelin.core@23528eb4 X wcfs: make it to use go modules for dependencies
-
Kirill Smelkov authored
In 6637d216 (lib/zodb: Add zstor_2zurl - way to convert a ZODB storage into URL to access it) we added zstor_2zurl function to convert a ZODB storage client object into an URL to access the storage. At that time the function knew how to understand FileStorage only. Let's add support for other storages that WCFS will need to support now. NEO URI scheme matches the one currently used on ZODB/go side. It semantically needs nexedi/neoppod!18 to be also applied to NEO/py side, but we do not care for now that that patch is not merged (yet, or forever) because extracted ZURL is used only with WCFS which uses NEO/go. NEO support also depends on custom patch to remember SSL credentials on NEO Client: kirr/neo@a2f192cb Some preliminary history: kirr/wendelin.core@5cb39463 fixup! X wcfs/zeo started to work locally kirr/wendelin.core@1cf3b228 X zstor_2zurl += NEO kirr/wendelin.core@7f8fa32a X lib/zodb: zstor_2zurl += NEO/SSL support kirr/wendelin.core@e26524df X wcfs, lib/zodb: DemoStorage support
-
- 25 Oct, 2021 8 commits
-
Kirill Smelkov authored
Upcoming libwcfs (C++ part of WCFS client) will need to use virtmem code and link to libvirtmem.
-
Kirill Smelkov authored
Manually, because there is no automatic dependency tracking in setuptools... Dependency tracking is needed to avoid miscompilation after incremental update under SlapOS/buildout/testnode/... when e.g. only .h was changed.
-
Kirill Smelkov authored
This is similar to e870781d (Top-level in-tree import redirector) but for upcoming pyx modules.
-
Kirill Smelkov authored
Soon we are going to split virtmem code into its own DSO to which bigfile extension will link. As plain setuptools does not support such dynamic linking, we are going to use setuptools_dso instead. But more: some of our upcoming extensions and DSOs will need to use Cython and C++ parts of Pygolang. Prepare that and use Extensions and DSO from golang.pyx.build to support that right from the start.
-
Kirill Smelkov authored
Currently we have only one extension wendelin.bigfile._bigfile, but we are going to add more, both python extensions and non-python DSOs. Start preparing for that by factoring out common code.
-
Kirill Smelkov authored
lib/tests/testprog/zloadrace.py:90:1 'ZODB.FileStorage.FileStorage' imported but unused This amends commit c37a989d.
-
Kirill Smelkov authored
Do what we can do without gdb and then tail to regular segmentation fault. With core file gdb can still be used, but it is handy if we already can get traceback of the crash into the log automatically. TODO better use https://github.com/ianlancetaylor/libbacktrace because backtrace_symbols often does not provide symbolic information. We do not do this now because libbacktrace is not always automatically installed.
-
Kirill Smelkov authored
This makes sure that those programs are always built afresh instead of being stuck at an outdated build. This is needed because the corresponding test .c file includes many other .c files and we don't implement dependency tracking.
-
- 01 Apr, 2021 3 commits
-
-
Kirill Smelkov authored
Else, e.g. after a failing test, that closed its storage and DB, but not all Connections, another test, just by starting new transaction, would invoke synchronization on that unclosed connection, which will try to access closed storage and likely fail. Fixes e.g. https://nexedijs.erp5.net/#/test_result_module/20210401-31B27B3D/5 Crash scenario is the same as described in 5a5ed2c7 (tests: Force-close ZODB connections in teardown, that testing code forgot to explicitly close). Only now we try to isolate tests from each other not only for different modules, but also for tests inside the same module.
-
Kirill Smelkov authored
The tests verify that there are no concurrency bugs around load, Connection.open and invalidations. See e.g.

    https://github.com/zopefoundation/ZODB/issues/290
    https://github.com/zopefoundation/ZEO/issues/155

By including the tests into wendelin.core, we will have CI coverage for all supported storages (FileStorage, ZEO, NEO), and for all supported ZODB (currently ZODB4, ZODB4-wc2 and ZODB5).

ZEO5 is known to currently fail zloadrace. However, even though ZODB#290 was fixed, ZEO5 turned out to also fail on zopenrace:

        def test_zodb_zopenrace():
            # exercises ZODB.Connection + particular storage implementation
    >       zopenrace.main()

    lib/tests/test_zodb.py:382:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    <decorator-gen-1>:2: in main
        ???
    ../../tools/go/pygolang/golang/__init__.py:103: in _
        return f(*argv, **kw)
    lib/tests/testprog/zopenrace.py:115: in main
        test(zstor)
    <decorator-gen-2>:2: in test
        ???
    ../../tools/go/pygolang/golang/__init__.py:103: in _
        return f(*argv, **kw)
    lib/tests/testprog/zopenrace.py:201: in test
        wg.wait()
    golang/_sync.pyx:246: in golang._sync.PyWorkGroup.wait
        ???
    golang/_sync.pyx:226: in golang._sync.PyWorkGroup.go.pyrunf
        ???
    lib/tests/testprog/zopenrace.py:165: in T1
        t1()
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

        def t1():
            transaction.begin()
            zconn = db.open()

            root = zconn.root()
            obj1 = root['obj1']
            obj2 = root['obj2']

            # obj1 - reload it from zstor
            # obj2 - get it from zconn cache
            obj1._p_invalidate()

            # both objects must have the same values
            i1 = obj1.i
            i2 = obj2.i
            if i1 != i2:
    >           raise AssertionError("T1: obj1.i (%d) != obj2.i (%d)" % (i1, i2))
    E           AssertionError: T1: obj1.i (3) != obj2.i (2)

    lib/tests/testprog/zopenrace.py:156: AssertionError
-
Kirill Smelkov authored
Previously if an assert or something failed in spawned thread, the main thread was usually spinning indefinitely = tests hang. -> Switch all threading places to use sync.WorkGroup and this way if a thread fails, all other threads are canceled and the exception is reported back to wg.wait in main thread. Since we start to go this route, NotifyChannel is reworked to fully use channels instead of busy-waiting.
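The failure-propagation behaviour described above can be sketched with the standard library alone. This is not Pygolang's sync.WorkGroup; `run_group`, `bad` and `patient` are hypothetical names, and a `threading.Event` stands in for context cancellation. The sketch only shows the idea: the first failing worker cancels its siblings, and its exception is re-raised in the caller instead of leaving the main thread spinning.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, FIRST_EXCEPTION, wait


def run_group(workers):
    """Run each worker(cancel) in its own thread; re-raise the first failure.

    `cancel` is a threading.Event that is set as soon as any worker fails,
    so the remaining workers can stop instead of waiting forever.
    (Hypothetical helper, not the pygolang API.)
    """
    cancel = threading.Event()
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(w, cancel) for w in workers]
        # returns when some worker raised, or when all of them finished
        done, _ = wait(futures, return_when=FIRST_EXCEPTION)
        failed = [f for f in done if f.exception() is not None]
        if failed:
            cancel.set()        # ask the other workers to stop
    # leaving the `with` block joins all worker threads
    if failed:
        raise failed[0].exception()


def bad(cancel):
    raise AssertionError("T1: obj1.i != obj2.i")


def patient(cancel):
    # would block forever if nobody reported that a sibling failed
    cancel.wait()
```

Calling `run_group([bad, patient])` raises the AssertionError in the calling thread, and `patient` returns once cancellation is signalled, so nothing hangs.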
-
- 26 Mar, 2021 1 commit
-
-
Kirill Smelkov authored
setuptools_dso 2 started to emit those autogenerated files. See https://github.com/mdavidsaver/setuptools_dso/pull/15 for details.
-
- 08 Mar, 2021 3 commits
-
-
Kirill Smelkov authored
NEO 1.9 was released in 2018 and is outdated by now. NEO 1.12 is currently the latest NEO release.
-
Kirill Smelkov authored
After switching to ZODB >= 4 in the previous commit, we can safely require zodbtools, because there is now no conflict in between ZODB3/ZODB eggs.
-
Kirill Smelkov authored
It's been a while since last ZODB3 3.10.7 release in 2016 and the last commit in upstream ZODB3 repository (3.10 branch) is from 2017. The world switched since then to ZODB4 and to ZODB5 after that. We were still requiring ZODB3, because ZODB3 3.11 egg was just a dependency on newer ZODB, ZEO, BTrees and persistent; and this way we could be supporting all ZODB3.10.x and ZODB4 and ZODB5 via ZODB3.11. However upcoming Wendelin.core 2, for its proper working, needs MVCC semantic as implemented in ZODB5. This forces us, even for ZODB4, to backport non-trivial bits from ZODB5 (see [1]). Maintaining ZODB3 support at this point becomes non-practical, because, to our knowledge, there is no wendelin.core user that plans to continue using ZODB3 without switching to at least ZODB4 in the near future. So goodbye ZODB3. Even though ZODB still stays with us, it gives a feeling similar to [2], because in 2014, when I was myself learning ZODB, it was through ZODB3 - still at the time when all ZODB bits were living together in one place. [1] ZODB!1 [2] https://lists.osuosl.org/pipermail/darcs-users/2008-September/014095.html
-
- 11 Dec, 2020 1 commit
-
-
Kirill Smelkov authored
DB.close() does `del self.storage`.

    https://github.com/zopefoundation/ZODB/blob/5.6.0-14-g0eae10cd0/src/ZODB/DB.py#L646

This way if DB was closed, but some conn(s) were not, it will crash in teardown as e.g. below:

    _____________ ERROR at teardown of test_bigfile_zblk1_zdata_reuse ______________

        def teardown_module():
    >       testdb.teardown()

    bigfile/tests/test_filezodb.py:58:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    self = <wendelin.lib.testing.TestDB_ZEO object at 0x7fb9c0216350>

        def teardown(self):
            # close connections that test code forgot to close
            for connref, tb in self.connv:
                conn = connref()
                if conn is None:
                    continue
                if not conn.opened:
                    continue    # still alive, but closed
                print("W: testdb: teardown: %s left not closed by test code"
                      "; opened by:\n%s" % (conn, tb), file=sys.stderr)

                db = conn.db()
    >           stor = db.storage
    E           AttributeError: 'DB' object has no attribute 'storage'

    lib/testing.py:217: AttributeError

The fix is simple - don't use db.storage at all, because it is not actually used in that code.
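A minimal sketch of the failure mode and of a teardown that avoids it. `FakeDB`, `FakeConn` and `force_close_leftovers` are hypothetical stand-ins written for illustration; they are not ZODB classes and not the actual wendelin.lib.testing code. The only behaviour taken from the message above is that closing the DB deletes its `storage` attribute while a connection may still be open.

```python
import weakref


class FakeDB:
    """Hypothetical stand-in for a DB whose close() deletes .storage."""
    def __init__(self):
        self.storage = object()

    def close(self):
        del self.storage


class FakeConn:
    """Hypothetical stand-in for a connection opened from a DB."""
    def __init__(self, db):
        self._db = db
        self.opened = True

    def db(self):
        return self._db

    def close(self):
        self.opened = False


def force_close_leftovers(connv):
    """Close connections that test code forgot; never touch db.storage."""
    closed = 0
    for connref in connv:
        conn = connref()
        if conn is None:        # already garbage-collected
            continue
        if not conn.opened:     # still alive, but closed
            continue
        conn.close()
        closed += 1
    return closed
```

With a DB that was closed before its connection, `force_close_leftovers` still works because it only needs the connection itself.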
-
- 17 Nov, 2020 2 commits
-
-
Kirill Smelkov authored
Since ZBigFile keeps references to fileh objects that are created through it, it forms a file <=> fileh cycle that is not collected without cyclic GC: https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.13-52-ga702d41/bigfile/file_zodb.py#L497 https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.13-52-ga702d41/bigfile/file_zodb.py#L566-571 We did not notice this leak until now because it is small, but with upcoming wendelin.core 2 it is important to release a fileh, because there is WCFS connection associated with fileh, and if fileh is not released, that connection also stays alive, keeping on-WCFS resources still being used, and preventing WCFS from being unmounted cleanly. -> Add cyclic GC support to PyBigFile / PyBigFileH NOTE: we still don't allow PyVMA <=> PyBigFileH cycles to be collected, because fileh_close called from fileh.__del__ asserts that there are no live mappings left. See added comments for details. There is no known practical need to use such cycles, so this should be ok. See also other patches on cyclic GC topic: - 450ad804 (bigarray: ArrayRef support for BigArray) // adds cyclic GC support for PyVMA - d97641d2 (bigfile/py: Properly untrack PyVMA from GC before dealloc) /proposed-for-review-on nexedi/wendelin.core!12
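To illustrate why such a cycle needs the cyclic collector, here is a small pure-Python sketch. `File` and `FileH` are hypothetical stand-ins, not the PyBigFile / PyBigFileH types. Pure-Python classes take part in cyclic GC automatically, whereas C extension types have to opt in (in CPython that means Py_TPFLAGS_HAVE_GC together with tp_traverse and tp_clear), which is the kind of support the patch adds.

```python
import gc
import weakref


class File:
    def __init__(self):
        self.filehs = []          # file -> fileh

    def open(self):
        fileh = FileH(self)
        self.filehs.append(fileh)
        return fileh


class FileH:
    def __init__(self, file):
        self.file = file          # fileh -> file: closes the cycle


def cycle_survives_refcounting():
    gc.disable()                  # keep the automatic cyclic collector out of the way
    try:
        f = File()
        ref = weakref.ref(f.open())
        del f                     # no external references remain
        alive_before = ref() is not None
        gc.collect()              # only the cyclic GC reclaims the pair
        alive_after = ref() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

Reference counting alone leaves the pair alive after `del f`; the objects go away only after an explicit `gc.collect()`.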
-
Kirill Smelkov authored
The logic in pyvma_traverse and pyvma_clear needs to be synchronized with PyVMA deallocation. In the next patch we'll be amending this logic, and it will help a reader to keep all those functions together. For the reference: PyVMA support for cyclic GC was introduced in 450ad804 (bigarray: ArrayRef support for BigArray). See also d97641d2 (bigfile/py: Properly untrack PyVMA from GC before dealloc). /proposed-for-review-on nexedi/wendelin.core!12
-
- 03 Nov, 2020 2 commits
-
-
Kirill Smelkov authored
Otherwise when /bin/sh is dash it fails with t/tfault-run: 35: test: on_pagefault: unexpected operator
-
Kirill Smelkov authored
Otherwise, if previous test.fault failed, tfault-run fails to start, e.g.

    >>> test.fault
    $ make test.fault # MAKEFLAGS=-j1
    x86_64-linux-gnu-gcc -pthread -g -Wall -D_GNU_SOURCE -std=gnu99 -fplan9-extensions -Wno-declaration-after-statement -Wno-error=declaration-after-statement -Iinclude -I3rdparty/ccan -I3rdparty/include bigfile/tests/tfault.c lib/bug.c lib/utils.c 3rdparty/ccan/ccan/tap/tap.c -o bigfile/tests/tfault.t
    t/tfault-run bigfile/tests/tfault.t faultr on_pagefault
    mkdir: cannot create directory ‘t/tfault-run.faultr’: File exists
    Makefile:186: recipe for target 'faultr.tfault' failed
    make: *** [faultr.tfault] Error 1
    rm bigfile/tests/tfault.t
    error test.fault 0.433s # 1t 1e 0f 0s
-
- 02 Nov, 2020 1 commit
-
-
Kirill Smelkov authored
Nxdtest[1] is tox-like tool to run tests under Nexedi testing infrastructure. See [2] for details. [1] https://lab.nexedi.com/nexedi/nxdtest [2] nexedi/slapos!839
-
- 11 Sep, 2020 1 commit
-
-
Kirill Smelkov authored
We need the following patch of mine: http://git.ozlabs.org/?p=ccan;a=commitdiff;h=b97c7f0841f5173a07a2571f2c99f944d8405a90
-
- 17 May, 2020 2 commits
-
-
Kirill Smelkov authored
-
Kirill Smelkov authored
It is hard for people to understand current wording, so let's expand zconn_at description to additionally explain what it is providing with second set of words, which, hopefully, lowers potential ambiguity a bit. /reported-by @jwolf083
-
- 17 Apr, 2020 1 commit
-
-
Kirill Smelkov authored
Pygolang egg provides "golang" python package, not "pygolang". Bug introduced in 5c8340d2 (*: Use defer for dbclose & friends).
-
- 15 Apr, 2020 1 commit
-
-
Kirill Smelkov authored
In PEP517 mode setup.py is sourced - not executed - and the build fails with ImportError like this:

    Preparing wheel metadata ... error
    ERROR: Command errored out with exit status 1:
     command: /home/kirr/src/wendelin/venv/z-dev/bin/python2 /home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmp2F3aEs
         cwd: /home/kirr/src/wendelin/wendelin.core
    Complete output (53 lines):
    running dist_info
    creating /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info
    writing requirements to /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/requires.txt
    writing /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/PKG-INFO
    writing top-level names to /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/top_level.txt
    writing dependency_links to /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/dependency_links.txt
    writing entry points to /tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/entry_points.txt
    writing manifest file '/tmp/pip-modern-metadata-sPiqUt/wendelin.core.egg-info/SOURCES.txt'
    package init file '__init__.py' not found (or not a regular file)
    Traceback (most recent call last):
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pip/_vendor/pep517/_in_process.py", line 257, in <module>
        main()
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pip/_vendor/pep517/_in_process.py", line 240, in main
        json_out['return_val'] = hook(**hook_input['kwargs'])
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pip/_vendor/pep517/_in_process.py", line 110, in prepare_metadata_for_build_wheel
        return hook(metadata_directory, config_settings)
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/build_meta.py", line 155, in prepare_metadata_for_build_wheel
        self.run_setup()
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/build_meta.py", line 234, in run_setup
        self).run_setup(setup_script=setup_script)
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/build_meta.py", line 141, in run_setup
        exec(compile(code, __file__, 'exec'), locals())
      File "setup.py", line 374, in <module>
        """.splitlines()]
      File "/home/kirr/src/tools/go/pygolang/golang/pyx/build.py", line 118, in setup
        setuptools_dso.setup(**kw)
      File "/home/kirr/src/tools/py/pypa/setuptools_dso/src/setuptools_dso/__init__.py", line 37, in setup
        _setup(**kws)
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/__init__.py", line 145, in setup
        return distutils.core.setup(**attrs)
      File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
        dist.run_commands()
      File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
        self.run_command(cmd)
      File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
        cmd_obj.run()
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/dist_info.py", line 31, in run
        egg_info.run()
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 296, in run
        self.find_sources()
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 303, in find_sources
        mm.run()
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 534, in run
        self.add_defaults()
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/egg_info.py", line 574, in add_defaults
        rcfiles = list(walk_revctrl())
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/setuptools/command/sdist.py", line 20, in walk_revctrl
        for item in ep.load()(dirname):
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2434, in load
        return self.resolve()
      File "/home/kirr/src/wendelin/venv/z-dev/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2444, in resolve
        raise ImportError(str(exc))
    ImportError: 'module' object has no attribute 'git_lsfiles'

See comments added to register_as_entrypoint for explanation of what happens. Wendelin.core will soon switch to PEP517 mode (by adding pyproject.toml) to build-require Cython, Pygolang and friends.
-