wcfs: xbtree: BTree-diff algorithm

This algorithm will be used internally by ΔBtail in the next patch.

The algorithm would be simple if we needed to diff two trees completely.
However in ΔBtail only a subpart of BTree nodes is tracked(*) and the diff
has to work modulo that tracking set.

No tests now because ΔBtail tests will cover treediff functionality as well.

Some preliminary history:

78f2f88b  X wcfs/xbtree: Fix treediff(a, ø)
5324547c  X wcfs/xbtree: root(a) must stay in trackSet even after treediff(a,ø)
f65f775b  X wcfs/xbtree: treediff(ø, b)
c75b1c6f  X wcfs/xbtree: Start killing holeIdx
ef5e5183  X treediff ret += δtkeycov
9d20f8e8  X treediff: Fix BUG while computing AB coverage
ddb28043  X rebuild: Don't return nil for empty ΔPPTreeSubSet - that leads to SIGSEGV
f68398c9  X wcfs: Move treediff into its own file

(*) because full BTree scan is needed to discover all of its nodes.

Quoting treediff documentation:

---- 8< ----

treediff provides diff for BTrees

Use δZConnectTracked + treediff to compute BTree-diff caused by δZ:

    δZConnectTracked(δZ, trackSet) -> δZTC, δtopsByRoot
    treediff(root, δtops, δZTC, trackSet, zconn{Old,New}) -> δT, δtrack, δtkeycov

δZConnectTracked computes BTree-connected closure of δZ modulo tracked set
and also returns δtopsByRoot to indicate which tree objects were changed and
in which subtree parts. With that information one can call treediff for each
changed root to compute BTree-diff and δ for trackSet itself.

BTree diff algorithm

diffT, diffB and δMerge constitute the diff algorithm implementation.
diff(A,B) works on a pair of A and B whole key ranges split into regions
covered by tree nodes. The splitting represents the current state of recursion
into the corresponding tree. If a node in a particular key range is a Bucket,
that bucket contributes to δ- in case of A, and to δ+ in case of B. If a node
in a particular key range is a Tree, the algorithm may want to expand that
tree node into its children and to recurse into some of the children.

There are two phases:

- Phase 1 expands A top->down driven by δZTC, adds reached buckets to δ-,
  and queues key regions of those buckets to be processed on B.

- Phase 2 starts processing from queued key regions, expands them on B and
  adds reached buckets to δ+. Then it iterates to reach consistency between
  A and B, because processing buckets on the B side may increase δ key
  coverage, and so the corresponding key ranges have to be processed again
  on A. That in turn may increase δ key coverage again, which needs to be
  processed on the B side, etc...

The final δ is the merge of δ- and δ+.

diffT has a more detailed explanation of phase 1 and phase 2 logic.
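
The two phases above can be illustrated on a toy model. The sketch below is
not the treediff implementation: it assumes each tree is already flattened
into a list of buckets with half-open key ranges, it ignores the tracking set
and ZODB loading entirely, and it uses "" in place of VDEL. All names in it
(Bucket, Delta, visit, diff, changedA) are hypothetical and exist only for
illustration; changedA plays the role that δZTC plays for phase 1.

```go
package main

import (
	"fmt"
	"sort"
)

// Bucket is a toy leaf: it covers keys in [Lo, Hi) and holds key->value entries.
type Bucket struct {
	Lo, Hi int
	KV     map[int]string
}

// Delta describes the change of one key; "" stands for "absent" (VDEL analog).
type Delta struct{ Old, New string }

// visit adds entries of not-yet-seen buckets of side that overlap [lo, hi)
// to acc, and returns the key ranges of the newly reached buckets.
func visit(side []Bucket, seen map[int]bool, lo, hi int, acc map[int]string) [][2]int {
	var reached [][2]int
	for i, b := range side {
		if seen[i] || b.Hi <= lo || hi <= b.Lo {
			continue
		}
		seen[i] = true
		for k, v := range b.KV {
			acc[k] = v
		}
		reached = append(reached, [2]int{b.Lo, b.Hi})
	}
	return reached
}

// diff computes A->B changes starting from the buckets of a listed in changedA.
func diff(a, b []Bucket, changedA []int) map[int]Delta {
	dMinus, dPlus := map[int]string{}, map[int]string{}
	seenA, seenB := map[int]bool{}, map[int]bool{}

	// Phase 1: changed buckets of A go to δ-, their key ranges are queued for B.
	var queueA, queueB [][2]int
	for _, i := range changedA {
		seenA[i] = true
		for k, v := range a[i].KV {
			dMinus[k] = v
		}
		queueB = append(queueB, [2]int{a[i].Lo, a[i].Hi})
	}

	// Phase 2: process queued ranges on B, then on A again, until the key
	// coverage stops growing. It terminates because seenA/seenB only grow.
	for len(queueA) > 0 || len(queueB) > 0 {
		for _, r := range queueB {
			queueA = append(queueA, visit(b, seenB, r[0], r[1], dPlus)...)
		}
		queueB = nil
		for _, r := range queueA {
			queueB = append(queueB, visit(a, seenA, r[0], r[1], dMinus)...)
		}
		queueA = nil
	}

	// Merge δ- and δ+: keys whose value is the same on both sides drop out.
	delta := map[int]Delta{}
	for k, old := range dMinus {
		if nv, ok := dPlus[k]; !ok || nv != old {
			delta[k] = Delta{Old: old, New: nv}
		}
	}
	for k, nv := range dPlus {
		if _, ok := dMinus[k]; !ok {
			delta[k] = Delta{Old: "", New: nv}
		}
	}
	return delta
}

func main() {
	a := []Bucket{{0, 10, map[int]string{1: "a", 5: "b"}}, {10, 20, map[int]string{12: "c"}}}
	b := []Bucket{{0, 15, map[int]string{1: "a", 5: "B", 12: "c"}}, {15, 20, map[int]string{17: "d"}}}
	d := diff(a, b, []int{0})

	keys := make([]int, 0, len(d))
	for k := range d {
		keys = append(keys, k)
	}
	sort.Ints(keys)
	for _, k := range keys {
		fmt.Printf("%d: %q -> %q\n", k, d[k].Old, d[k].New)
	}
}
```

In this example only the first bucket of A is marked as changed, yet the
result also reports key 17: the B bucket reached for range [0,10) covers
[0,15), which pulls in the second A bucket, which in turn pulls in the second
B bucket. Key 12 is visited on both sides but drops out in the merge because
its value is unchanged.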

Commit b7b59e20 authored Oct 26, 2021 by Kirill Smelkov
4 changed files
--- a/wcfs/internal/xbtree/treediff.go
+++ b/wcfs/internal/xbtree/treediff.go
--- a/wcfs/internal/xbtree/xbtree.go
+++ b/wcfs/internal/xbtree/xbtree.go
+// Copyright (C) 2021  Nexedi SA and Contributors.
+//                     Kirill Smelkov <kirr@nexedi.com>
+//
+// This program is free software: you can Use, Study, Modify and Redistribute
+// it under the terms of the GNU General Public License version 3, or (at your
+// option) any later version, as published by the Free Software Foundation.
+//
+// You can also Link and Combine this program with other software covered by
+// the terms of any of the Free Software licenses or any of the Open Source
+// Initiative approved licenses and Convey the resulting work. Corresponding
+// source of such a combination shall include the source code for all other
+// software used.
+//
+// This program is distributed WITHOUT ANY WARRANTY; without even the implied
+// warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+//
+// See COPYING file for full licensing terms.
+// See https://www.nexedi.com/licensing for rationale and options.
+
+// Package xbtree complements package lab.nexedi.com/kirr/neo/go/zodb/btree.
+//
+// It provides the following amendments:
+//
+// - ΔBtail (tail of revisional changes to BTrees).
+package xbtree
+
+// this file contains only tree types and utilities.
+// main code lives in δbtail.go and treediff.go .
+
+import (
+	"context"
+	"fmt"
+
+	"lab.nexedi.com/kirr/go123/xerr"
+	"lab.nexedi.com/kirr/neo/go/zodb"
+
+	"lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/set"
+	"lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/xbtree/blib"
+	"lab.nexedi.com/nexedi/wendelin.core/wcfs/internal/xzodb"
+)
+
+// XXX instead of generics
+type Tree   = blib.Tree
+type Bucket = blib.Bucket
+type Node   = blib.Node
+type TreeEntry   = blib.TreeEntry
+type BucketEntry = blib.BucketEntry
+
+type Key      = blib.Key
+type KeyRange = blib.KeyRange
+const KeyMax  = blib.KeyMax
+const KeyMin  = blib.KeyMin
+
+// value is assumed to be persistent reference.
+// deletion is represented as VDEL.
+type Value  = zodb.Oid
+const VDEL  = zodb.InvalidOid
+
+type setOid = set.Oid
+
+
+// pathEqual returns whether two paths are the same.
+func pathEqual(patha, pathb []zodb.Oid) bool {
+	if len(patha) != len(pathb) {
+		return false
+	}
+	for i, a := range patha {
+		if pathb[i] != a {
+			return false
+		}
+	}
+	return true
+}
+
+// vnode returns brief human-readable representation of node.
+func vnode(node Node) string {
+	kind := "?"
+	switch node.(type) {
+	case *Tree:   kind = "T"
+	case *Bucket: kind = "B"
+	}
+	return kind + node.POid().String()
+}
+
+// zgetNodeOrNil returns btree node corresponding to zconn.Get(oid) .
+// if the node does not exist, (nil, nil) is returned.
+func zgetNodeOrNil(ctx context.Context, zconn *zodb.Connection, oid zodb.Oid) (node Node, err error) {
+	defer xerr.Contextf(&err, "getnode %s@%s", oid, zconn.At())
+	xnode, err := xzodb.ZGetOrNil(ctx, zconn, oid)
+	if xnode == nil || err != nil {
+		return nil, err
+	}
+
+	node, ok := xnode.(Node)
+	if !ok {
+		return nil, fmt.Errorf("unexpected type: %s", zodb.ClassOf(xnode))
+	}
+	return node, nil
+}
+
+
+func panicf(format string, argv ...interface{}) {
+	panic(fmt.Sprintf(format, argv...))
+}
--- a/wcfs/internal/xbtree/xbtreetest/xbtreetest.go
+++ b/wcfs/internal/xbtree/xbtreetest/xbtreetest.go
@@ -43,6 +43,7 @@ const KeyMin  = blib.KeyMin

 type setKey = set.I64

+// XXX dup from xbtree  (to avoid import cycle)
 const VDEL  = zodb.InvalidOid



--- a/wcfs/internal/xzodb/xzodb.go
+++ b/wcfs/internal/xzodb/xzodb.go
@@ -22,9 +22,12 @@ package xzodb

 import (
 	"context"
+	"errors"
 	"fmt"
+	"reflect"

 	"lab.nexedi.com/kirr/go123/xcontext"
+	"lab.nexedi.com/kirr/go123/xerr"

 	"lab.nexedi.com/kirr/neo/go/transaction"
 	"lab.nexedi.com/kirr/neo/go/zodb"
@@ -80,3 +83,54 @@ func ZOpen(ctx context.Context, zdb *zodb.DB, zopt *zodb.ConnOptions) (_ *ZConn,
 		TxnCtx:     txnCtx,
 	}, nil
 }
+
+// ZGetOrNil returns zconn.Get(oid), or (nil, nil) if the object does not exist.
+func ZGetOrNil(ctx context.Context, zconn *zodb.Connection, oid zodb.Oid) (_ zodb.IPersistent, err error) {
+	defer xerr.Contextf(&err, "zget %s@%s", oid, zconn.At())
+	obj, err := zconn.Get(ctx, oid)
+	if err != nil {
+		if IsErrNoData(err) {
+			err = nil
+		}
+		return nil, err
+	}
+
+	// activate the object to find out it really exists
+	// after removal on storage, the object might have stayed in Connection
+	// cache due to e.g. PCachePinObject, and it will be PActivate that
+	// will return "deleted" error.
+	err = obj.PActivate(ctx)
+	if err != nil {
+		if IsErrNoData(err) {
+			return nil, nil
+		}
+		return nil, err
+	}
+	obj.PDeactivate()
+
+	return obj, nil
+}
+
+// IsErrNoData returns whether err is due to NoDataError or NoObjectError.
+func IsErrNoData(err error) bool {
+	var eNoData   *zodb.NoDataError
+	var eNoObject *zodb.NoObjectError
+
+	switch {
+	case errors.As(err, &eNoData):
+		return true
+	case errors.As(err, &eNoObject):
+		return true
+	default:
+		return false
+	}
+}
+
+// XidOf returns string representation of object xid.
+func XidOf(obj zodb.IPersistent) string {
+	if obj == nil || reflect.ValueOf(obj).IsNil() {
+		return "ø"
+	}
+	xid := zodb.Xid{At: obj.PJar().At(), Oid: obj.POid()}
+	return xid.String()
+}