fsrefs: Optimize IO (take 2) (#340)

* fsrefs: Optimize IO (take 2)

Access objects in the order of their position in the file instead of in the
order of their OID. This should give a dramatic speedup when the data are on HDD.

For example, @perrinjerome reports that on a 73GB database it takes almost 8h
to run fsrefs (whereas on the same database fstest takes 15 minutes) [1,2].
After the patch, fsrefs took ~80 minutes to run on the same database. In
other words, this is a ~6x improvement.

Fsrefs has no tests. I tested it only lightly by generating a slightly
corrupt database with a deleted referred object (*), and it gives the same
output as the unmodified fsrefs:

    oid 0x0 __main__.Object
    last updated: 1979-01-03 21:00:42.900001, tid=0x285cbacb70a3db3
    refers to invalid objects:
        oid 0x07 missing: '<unknown>'
        oid 0x07 object creation was undone: '<unknown>'

This "take 2" version is derived from
https://github.com/zopefoundation/ZODB/pull/338 and only iterates objects in
the order of their in-file position, without building the complete reference
graph in RAM, because that in-RAM graph would consume ~12GB of memory.

The added pos2oid in-RAM index also consumes memory: for the 73GB database in
question fs._index takes ~700MB, while pos2oid takes ~2GB. In theory it could
be less, because we only need an array of oids sorted by
key(oid)=fs._index[oid]. However, array.array does not support sorting, and
if we use a plain list to keep just []oid, the memory consumption for that
list alone is ~5GB. Also, because list.sort(key=...) internally allocates
memory for the key array (and list.sort(cmp=...) was removed from Python3),
the total memory consumption just to produce the list of []oid ordered by pos
is ~10GB. So without delving into C/Cython and/or manually sorting the array
in Python (= slow), using QQBTree seems to be the best out-of-the-box option
for the oid-by-pos index.

[1] nexedi/zodbtools!19 (comment 129480)
[2] nexedi/zodbtools!19 (comment 129551)

(*) test database generated via a slightly modified gen_testdata.py from zodbtools:

https://lab.nexedi.com/nexedi/zodbtools/blob/v0.0.0.dev8-28-g129afa6/zodbtools/test/gen_testdata.py

+

```diff
--- a/zodbtools/test/gen_testdata.py
+++ b/zodbtools/test/gen_testdata.py
@@ -229,7 +229,7 @@ def ext(subj): return {}
 # delete an object
 name = random.choice(list(root.keys()))
 obj = root[name]
- root[name] = Object("%s%i*" % (name, i))
+# root[name] = Object("%s%i*" % (name, i))
 # NOTE user/ext are kept empty on purpose - to also test this case
 commit(u"", u"predelete %s" % unpack64(obj._p_oid), {})
```

/cc @tim-one, @jeremyhylton, @jamadden
/reviewed-by @jamadden, @perrinjerome
/reviewed-on https://github.com/zopefoundation/ZODB/pull/340
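The ordering idea in the message above can be illustrated with a minimal, stdlib-only sketch. It is not the patch itself: the patch keeps the reverse index in a `QQBTree`, whose keys stay sorted, whereas this sketch swaps in a plain dict plus `sorted()`, which is exactly the memory-hungry approach the message argues against for a large database. The helpers `p64`/`u64` are local stand-ins for the functions of the same name in `ZODB.utils`, and `oids_by_position` and `toy_index` are hypothetical names used only for illustration.

```python
import struct


def p64(value):
    """Pack an integer into an 8-byte big-endian string (stand-in for ZODB.utils.p64)."""
    return struct.pack(">Q", value)


def u64(data):
    """Unpack an 8-byte big-endian string into an integer (stand-in for ZODB.utils.u64)."""
    return struct.unpack(">Q", data)[0]


def oids_by_position(index):
    """Yield oids from an {oid -> file position} mapping in ascending position order.

    Hypothetical helper: a dict plus sorted() is used here instead of a QQBTree.
    """
    # Reverse the index: {pos -> u64(oid)}.
    pos2oid = {pos: u64(oid) for oid, pos in index.items()}
    # Visit positions from the start of the file towards its end.
    for pos in sorted(pos2oid):
        yield p64(pos2oid[pos])


# Toy index: oid 1 is stored last in the file, oid 2 first.
toy_index = {p64(1): 900, p64(2): 100, p64(3): 500}
order = [u64(oid) for oid in oids_by_position(toy_index)]
print(order)  # [2, 3, 1]
```

Iterating by OID would visit the toy objects as 1, 2, 3, that is at file positions 900, 100, 500; iterating by position turns that into a single forward sweep over the file.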

Commit 79078049 authored Mar 29, 2021 by Kirill Smelkov Committed by GitHub Mar 29, 2021

Showing with 21 additions and 3 deletions

CHANGES.rst +2 -0

src/ZODB/scripts/fsrefs.py +19 -3

--- a/CHANGES.rst
+++ b/CHANGES.rst
@@ -8,6 +8,8 @@
 - Fix UnboundLocalError when running fsoids.py script.
   See `issue 268 <https://github.com/zopefoundation/ZODB/issues/285>`_.

+- Rework ``fsrefs`` script to work significantly faster by optimizing how it does
+  IO. See `PR 340 <https://github.com/zopefoundation/ZODB/pull/340>`_.

 5.6.0 (2020-06-11)
 ==================

--- a/src/ZODB/scripts/fsrefs.py
+++ b/src/ZODB/scripts/fsrefs.py
@@ -66,9 +66,10 @@ import traceback

 from ZODB.FileStorage import FileStorage
 from ZODB.TimeStamp import TimeStamp
-from ZODB.utils import u64, oid_repr, get_pickle_metadata, load_current
+from ZODB.utils import u64, p64, oid_repr, get_pickle_metadata, load_current
 from ZODB.serialize import get_refs
 from ZODB.POSException import POSKeyError
+from BTrees.QQBTree import QQBTree

 # There's a problem with oid.  'data' is its pickle, and 'serial' its
 # serial number.  'missing' is a list of (oid, class, reason) triples,
@@ -118,7 +119,18 @@ def main(path=None):
     # This does not include oids in undone.
     noload = {}

-    for oid in fs._index.keys():
+    # build {pos -> oid} index that is reverse to {oid -> pos} fs._index
+    # we'll need this to iterate objects in order of ascending file position to
+    # optimize disk IO.
+    pos2oid = QQBTree() # pos -> u64(oid)
+    for oid, pos in fs._index.iteritems():
+        pos2oid[pos] = u64(oid)
+
+    # pass 1: load all objects listed in the index and remember those objects
+    # that are deleted or load with an error. Iterate objects in order of
+    # ascending file position to optimize disk IO.
+    for oid64 in pos2oid.itervalues():
+        oid = p64(oid64)
         try:
             data, serial = load_current(fs, oid)
         except (KeyboardInterrupt, SystemExit):
@@ -130,9 +142,13 @@ def main(path=None):
                 traceback.print_exc()
             noload[oid] = 1

+    # pass 2: go through all objects again and verify that their references do
+    # not point to problematic object set. Iterate objects in order of ascending
+    # file position to optimize disk IO.
     inactive = noload.copy()
     inactive.update(undone)
-    for oid in fs._index.keys():
+    for oid64 in pos2oid.itervalues():
+        oid = p64(oid64)
         if oid in inactive:
             continue
         data, serial = load_current(fs, oid)
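
The two passes named in the added comments can be sketched on a toy in-memory store. This is a simplified illustration under stated assumptions, not the script's logic: `find_bad_refs` and `toy` are hypothetical names, an object that fails to load is modelled as `None`, and the separate "undone" and "noload" sets of the real script are collapsed into one `inactive` set.

```python
def find_bad_refs(objects, order):
    """Report references that point to missing or unloadable objects.

    objects: {oid: list of referenced oids, or None if the object cannot be loaded}
    order:   oids in the order they should be visited (e.g. by file position)
    """
    # Pass 1: visit every object once and remember the ones that cannot be loaded.
    inactive = set()
    for oid in order:
        if objects[oid] is None:
            inactive.add(oid)

    # Pass 2: visit the loadable objects again and check each of their references.
    report = {}
    for oid in order:
        if oid in inactive:
            continue
        bad = [ref for ref in objects[oid] if ref in inactive or ref not in objects]
        if bad:
            report[oid] = bad
    return report


# Toy store: object 0 refers to 1 (fine) and to 7 (cannot be loaded).
toy = {0: [1, 7], 1: [], 7: None}
print(find_bad_refs(toy, [1, 7, 0]))  # {0: [7]}
```

Both passes take the same `order` argument, which mirrors how the patch reuses the single position-ordered index for both loops.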