wcfs: tests: xbtree.py package for inspecting/manipulating internal structure of BTrees
To handle invalidations, WCFS will need to detect changes to both ZBlk objects and to ZBigFile.blktab BTree that is mapping file blocks to ZBlk objects. And with BTree detecting changes is much more complex, because when a BTree changes, it might be rebalanced, or keys migrated from one tree/bucket node to another tree/bucket node. In other words a BTree change might be not only a change to a {}key->value dictionary, but also a change to BTree topology. Because there are many BTree topologies that correspond to the same {}key->value state, a change from kv₁ to kv₂, even if kv₁ and kv₂ are close to each other, might be accompanied by a dramatic change to topology of the tree. This creates a need for thoroughly testing the BTree difference algorithm because many of BTree topologies changes are tricky, and if a simple algorithm works on relatively stable topology updates, it does not necessarily mean that that same algorithm will continue to work correctly in the general case. So, as a preparatory step, here comes xbtree.py package, that can be used to inspect tree topologies, to create trees with specified topology and to manipulate topology of an existing tree. This package will be used in tests for upcoming ΔBtail. For debugging, and also since those tests will involve both Go and Python parts, it creates the need to be able to specify and exchange topology of a tree via compact string. This package also defines so called "topology encoding" to do so. Some preliminar history: kirr/wendelin.core@fb56193f X fix metric to keep Z <- N order stable over key^ kirr/wendelin.core@809304d1 X "B:" indicates ø bucket with k&b, "B" - ø bucket with only keys kirr/wendelin.core@9eca74ec X Teach AllStructs to emit topologies with values kirr/wendelin.core@1b962f03 X Restructure: found bug that it was not marking objects as modified kirr/wendelin.core@9181c5d9 X Restructure; verify that it marks as changed only modifed nodes kirr/wendelin.core@e9902c4a X improve `xbtree topoview` For the reference xbtree.py package documentation is quoted below. ---- 8< ---- Package xbtree provides utilities for inspecting/manipulating internal structure of integer-keyed BTrees. It will be primarily used to help verify ΔBTail in WCFS. - `Tree` represents a tree node. - `Bucket` represents a bucket node. - `StructureOf` returns internal structure of ZODB BTree represented as Tree and Bucket nodes. - `Restructure` reorganizes ZODB BTree instance according to specified topology structure. - `AllStructs` generates all possible BTree topology structures with given keys. Topology encoding ----------------- Topology encoding provides way to represent structure of a Tree as path-like string. TopoEncode converts Tree into its topology-encoded representation, while TopoDecode decodes topology-encoded string back into Tree. The following example illustrates topology encoding represented by string "T3/T-T/B1-T5/B-B7,8,9": [ 3 ] T3/ represents Tree([3]) / \ [ ] [ ] T-T/ represents two empty Tree([]) ↓ ↓ |1|[ 5 ] B1-T5/ represent Bucket([1]) and Tree([5]) / \ || |7|8|9| B-B7,8,9 represents empty Bucket([]) and Bucket([7,8,9]) Topology encoding specification: A Tree is encoded by level-order traversal, delimiting layers with "/". Inside a layer Tree and Bucket nodes are signalled as "T<keys>" ; Tree "B<keys>" ; Bucket with only keys "B<keys+values>" ; Bucket with keys and values Keys are represented as ","-delimited list of integers. For example Tree or Bucket with [1,3,5] keys are represented as "T1,3,5" ; Tree([1,3,5]) "B1,3,5" ; Bucket([1,3,5]) Keys+values are represented as ","-delimited list of "<key>:<value>" pairs. For example Bucket corresponding to {1:1, 2:4, 3:9} is represented as "B1:1,2:4,3:9" ; Bucket([1,2,3], [1,4,9]) Empty keys+values are represented as ":" - an empty Bucket for key->value mapping is represented as "B:" ; Bucket([], []) Nodes inside one layer are delimited with "-". For example a layer consisting of an empty Tree, a Tree with [1,3] keys, and Bucket with [4,5] keys is represented as "T-T1,3-B4,5" ; layer with Tree([]), Tree([1,3]) and Bucket([4,5]) A layer consists of nodes that are followed by node-node links from upper layer in left-to-right order. Visualization ------------- The following visualization utilities are provided to help understand BTrees better: - `topoview` displays BTree structure given its topology-encoded representation. - `Tree.graphviz` returns Tree graph representation in dot language.
Showing
wcfs/internal/xbtree.py
0 → 100644
wcfs/internal/xbtree_test.py
0 → 100644