1. 19 Jun, 2024 1 commit
    • Add zodbtraverse to inspect & compare which OIDs are load- & reachable · eeb5aea2
      Levin Zimmermann authored
      With zodbtraverse a user can traverse a ZODB database graph from its root to
      find all OIDs that are both reachable and loadable. It does not dump OIDs
      that are loadable but not reachable (orphaned objects), nor OIDs that are
      reachable but not loadable (broken or corrupt objects). The found OIDs are
      stored in a SQLite database. After traversing a ZODB database twice, the two
      sets of found OIDs can be compared to see whether the database differs.
      
      This tool was developed to see if the same objects of a ZODB database are
      still reachable and loadable after changes were applied to its storage. It
      helps to guarantee the integrity of the data (e.g. to ensure the storage
      changes don't lead to data loss).
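
      The traverse-then-compare idea can be sketched as follows. This is only an
      illustration under stated assumptions, not the zodbtraverse implementation:
      the `load` callback, the toy graphs and the `oids` table name are all
      hypothetical, and a plain dict stands in for a ZODB storage.

```python
import sqlite3

def traverse(root, load, db):
    """Walk the object graph from `root`, storing every OID that loads."""
    db.execute("CREATE TABLE IF NOT EXISTS oids (oid INTEGER PRIMARY KEY)")
    seen, todo = set(), [root]
    while todo:
        oid = todo.pop()
        if oid in seen:
            continue
        seen.add(oid)
        try:
            refs = load(oid)   # hypothetical: returns the OIDs referenced by `oid`
        except KeyError:
            continue           # reachable but not loadable: not recorded
        db.execute("INSERT INTO oids VALUES (?)", (oid,))
        todo.extend(refs)
    db.commit()

def found(db):
    return {row[0] for row in db.execute("SELECT oid FROM oids")}

# Toy graphs: oid -> referenced oids. OID 9 is orphaned; OID 3 is missing in `after`.
before = {0: [1, 2], 1: [3], 2: [], 3: [], 9: []}
after  = {0: [1, 2], 1: [3], 2: [], 9: []}

db1, db2 = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
traverse(0, before.__getitem__, db1)
traverse(0, after.__getitem__, db2)
lost = found(db1) - found(db2)   # OIDs no longer reachable and loadable
```

      Comparing the two result sets is then a plain set difference, which is
      what makes running the traversal before and after a storage change useful.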
  2. 16 Feb, 2024 1 commit
  3. 01 Sep, 2023 3 commits
    • zodbcommit: include the status of transaction · a9853038
      Jérome Perrin authored
      Even though the interface of IStorageRestorable.tpc_begin does not
      have a "status" argument, the notes below it describe that the
      actual implementation uses it:
      
      https://github.com/zopefoundation/ZODB/blob/0632974d/src/ZODB/interfaces.py#L950-L956
      
      This is used by FileStorage:
      
      https://github.com/zopefoundation/ZODB/blob/0632974d/src/ZODB/FileStorage/format.py#L30-L39
      
      and the storage methods seem to accept this argument:
      
      https://github.com/zopefoundation/ZODB/blob/0632974d/src/ZODB/BaseStorage.py#L182
      https://github.com/zopefoundation/ZEO/blob/e5637818/src/ZEO/ClientStorage.py#L888
      https://lab.nexedi.com/nexedi/neoppod/blob/fd87e153/neo/client/app.py#L473
      
      Propagating the status fixes some cases where restoring commits did not
      recreate a storage that is byte-to-byte equivalent. This happened with
      a FileStorage that was packed and contained transactions with "p"
      status.
      Co-authored-by: Kirill Smelkov <kirr@nexedi.com>
      Reviewed-on: nexedi/zodbtools!24
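
      A minimal sketch of what propagating the status means. `RecordingStorage`
      and `begin_restore` are hypothetical names; the `tpc_begin(txn, tid, status)`
      shape is assumed from the linked storage implementations and is not part of
      the documented interface.

```python
class RecordingStorage(object):
    """Hypothetical stand-in for a ZODB storage; records what tpc_begin got."""
    def __init__(self):
        self.begun = []

    def tpc_begin(self, txn, tid=None, status=' '):
        self.begun.append((tid, status))

def begin_restore(stor, txn, tid, status):
    # Pass the status read from the dump through instead of relying on the
    # default " ", so that a packed ("p") transaction stays packed.
    stor.tpc_begin(txn, tid, status)

stor = RecordingStorage()
begin_restore(stor, object(), b'\x02\x85\xcb\xac\x83i\xd0f', 'p')
```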
    • test/gen_testdata: Generate transactions with both " " and "p" status · 1b480c93
      Kirill Smelkov authored
      Until now we were generating only regular transactions with " " status,
      which does not cover e.g. the restore case where a packed transaction needs
      to be replicated: instead of recreating it bit-to-bit exactly as the
      original with "p" status, restore recreates it with " " status, breaking
      the restore promise.
      
      Adjusting testdata this way exposes that bug in restore:
      
          ======================================== FAILURES ========================================
          ________________________________________ test_zodbrestore[!zext] ________________________________________
      
          tmpdir = local('/tmp/pytest-of-kirr/pytest-17/test_zodbrestore__zext_0'), zext = <function _ at 0x7fd6b7a03750>
      
              @func
              def test_zodbrestore(tmpdir, zext):
                  zkind = '_!zext' if zext.disabled else ''
      
                  # restore from testdata/1.zdump.ok and verify it gives result that is
                  # bit-to-bit identical to testdata/1.fs
                  tdata = dirname(__file__) + "/testdata"
                  @func
                  def _():
                      zdump = open("%s/1%s.zdump.raw.ok" % (tdata, zkind), 'rb')
                      defer(zdump.close)
      
                      stor = storageFromURL('%s/2.fs' % tmpdir)
                      defer(stor.close)
      
                      zodbrestore(stor, zdump)
                  _()
      
                  zfs1 = readfile(fs1_testdata_py23(tmpdir, "%s/1%s.fs" % (tdata, zkind)))
                  zfs2 = readfile("%s/2.fs" % tmpdir)
          >       assert zfs1 == zfs2
          E       assert 'FS21\x02\x85...0\x00\x00\xb2' == 'FS21\x02\x85\...0\x00\x00\xb2'
          E         Skipping 49 identical leading characters in diff, use -v to show
          E         Skipping 22871 identical trailing characters in diff, use -v to show
          E         - 0\x00\x00tp\x00\x08\x00\t\x00\x00user0.15step 0.15\x00\x00\x00\x00\x00\x00\x00\x03\x02\x85\xcb\xac\x83i\xd0f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00"\x80\x02c__main__
          E         ?           ^
          E         + 0\x00\x00t \x00\x08\x00\t\x00\x00user0.15step 0.15\x00\x00\x00\x00\x00\x00\x00\x03\x02\x85\xcb\xac\x83i\xd0f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00...
          E
          E         ...Full output truncated (39 lines hidden), use '-vv' to show
      
          test_restore.py:53: AssertionError
      
      Having "p" transactions in the testdata will also help ensure that all tools
      handle such transactions well.
      
      The problem of restore not handling "p" status properly was reported by Jérome
      at nexedi/zodbtools!24.
      
      In the next patch we will fix that problem.
      
      /reviewed-by @jerome
      /reviewed-on nexedi/zodbtools!24
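
      The single differing byte in the diff above is the transaction status. A
      small sketch, assuming the FileStorage transaction header layout is
      tid(8), length(8), status(1), then three 2-byte lengths for user,
      description and extension; the header values below are illustrative, not
      read from testdata.

```python
import struct

# Assumed layout: tid(8) length(8) status(1) len(user)(2) len(desc)(2) len(ext)(2)
TRANS_HDR = ">8sQcHHH"

def txn_status(header):
    tid, tlen, status, ulen, dlen, elen = struct.unpack(TRANS_HDR, header)
    return status

tid = b'\x02\x85\xcb\xac\x83i\xd0f'
hdr_packed  = struct.pack(TRANS_HDR, tid, 116, b'p', 8, 9, 0)
hdr_regular = struct.pack(TRANS_HDR, tid, 116, b' ', 8, 9, 0)

# Offsets at which the two headers differ: only the status byte.
diff_at = [i for i in range(len(hdr_packed))
           if hdr_packed[i:i+1] != hdr_regular[i:i+1]]
```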
    • test/gen_testdata: Adjust it to match current testdata/ state · 37786d10
      Kirill Smelkov authored
      In 80559a94 ("zodbdump: support --pretty option with a format to show
      pickles disassembly") we added support for zodbdump --pretty and
      adjusted files in testdata/ to be named like 1.zdump.{raw,zpickledis}.ok
      instead of just 1.zdump.ok. However, that renaming and the generation of
      1.zdump.zpickledis.ok were apparently done by hand, because rerunning
      gen_testdata.py still regenerates the old 1.zdump.ok. It seems that during
      !22 I missed that gen_testdata.py was not updated.
      
      -> Fix it.
      
      Running gen_testdata.py with py2 and ZODB 5.8.1 regenerates *.fs and
      *.ok files in testdata/ in exactly the same state as they were.
      
      /reviewed-by @jerome
      /reviewed-on !24
  4. 08 Sep, 2022 1 commit
  5. 07 Sep, 2022 7 commits
    • analyze: test: Fix tidmin thinko in "empty range" test · 65ebbe7b
      Kirill Smelkov authored
      Empty-range test added in b4824ad5 (analyze: fix ZeroDivisionErrors when
      report is empty) intended to use 0xffffffffffffffff TID, but used just
      'ffffffffffffffff' string instead. It was passing on py2 partly by luck,
      but on py3 it fails because tidmin type is mismatched:
      
          _______________________________ test_zodbanalyze _______________________________
      
          tmpdir = local('/tmp/pytest-of-kirr/pytest-30/test_zodbanalyze0')
          capsys = <_pytest.capture.CaptureFixture object at 0x7f7bb3f9a4f0>
      
              def test_zodbanalyze(tmpdir, capsys):
                  ...
      
                  # empty range
                  report(
          >           analyze(
                          tfs1,
                          use_dbm=False,
                          delta_fs=False,
                          tidmin="ffffffffffffffff",
                          tidmax=None,
                      ),
                      csv=False,
                  )
      
          zodbtools/test/test_analyze.py:68:
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
          ../../venv/py3.venv/lib/python3.9/site-packages/decorator.py:232: in fun
              return caller(func, *(extras + args), **kw)
          ../../../tools/go/pygolang/golang/__init__.py:103: in _
              return f(*argv, **kw)
          zodbtools/zodbanalyze.py:181: in analyze
              fsi = fs.iterator(tidmin, tidmax)
          ../ZODB/src/ZODB/FileStorage/FileStorage.py:1381: in iterator
              return FileIterator(self._file_name, start, stop)
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      
          self = <ZODB.FileStorage.FileStorage.FileIterator object at 0x7f7bb348c6d0>
          filename = '/tmp/pytest-of-kirr/pytest-30/test_zodbanalyze0/1.fs'
          start = 'ffffffffffffffff', stop = None, pos = 4
      
              def __init__(self, filename, start=None, stop=None, pos=4):
                  assert isinstance(filename, STRING_TYPES)
                  file = open(filename, 'rb')
                  self._file = file
                  self._file_name = filename
                  if file.read(4) != packed_version:
                      raise FileStorageFormatError(file.name)
                  file.seek(0, 2)
                  self._file_size = file.tell()
                  if (pos < 4) or pos > self._file_size:
                      raise ValueError("Given position is greater than the file size",
                                       pos, self._file_size)
                  self._pos = pos
          >       assert start is None or isinstance(start, bytes)
          E       AssertionError
      
          ../ZODB/src/ZODB/FileStorage/FileStorage.py:1816: AssertionError
          ------------------------------ Captured log call -------------------------------
          ERROR    ZODB.FileStorage:FileStorage.py:480 loading index
          UnicodeDecodeError: 'ascii' codec can't decode byte 0xb7 in position 25: ordinal not in range(128)
      
          The above exception was the direct cause of the following exception:
      
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/FileStorage/FileStorage.py", line 478, in _restore_index
              info = fsIndex.load(index_name)
            File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/fsIndex.py", line 138, in load
              v = unpickler.load()
          SystemError: <built-in method read of _io.BufferedReader object at 0x7f7bb3df03b0> returned a result with an error set
          ERROR    ZODB.FileStorage:FileStorage.py:480 loading index
          UnicodeDecodeError: 'ascii' codec can't decode byte 0xb7 in position 25: ordinal not in range(128)
      
          ...
      
      -> Fix it by properly preparing tidmin in the test as an 8-byte binary string.
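
      For illustration, a hex TID string can be turned into the 8-byte binary
      form that FileIterator asserts on. `tid_from_hex` is a hypothetical helper
      built on the standard library, not necessarily what the test uses.

```python
from binascii import unhexlify

def tid_from_hex(s):
    """Convert a 16-digit hex TID string into its 8-byte binary form."""
    raw = unhexlify(s)
    if len(raw) != 8:
        raise ValueError("TID must be 8 bytes, got %d" % len(raw))
    return raw

tidmin = tid_from_hex("ffffffffffffffff")
```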
    • *: Fix working on py3 by using bstr bytestring instead of raw bytes · 9861c136
      Kirill Smelkov authored
      e.g. for ObjectData .hashfunc:
      
      In many contexts we need .hashfunc to behave like a string, e.g. for
      accessing hashRegistry by keys. In many other contexts, e.g. when
      zodbdump input is parsed or emitted, it is handier to treat it as
      raw bytes.
      
      If we let .hashfunc be of type str, it breaks the second mode. If it is of
      type bytes, it breaks the first mode.
      
      And also in many places it is hard to constantly encode/decode between str
      and bytes, especially where an object is sometimes used in a string context
      and sometimes in a binary context.
      
      -> Fix it all in one go by using bytestring type from pygolang,
      which provides both unicode string and binary semantics simultaneously.
      
      This needs bstr from pygolang (see kirr/pygolang@c9648c44),
      but even if pygolang comes without bstr, with this patch zodbtools
      continues to work ok on py2 - it will be just py3 mode that won't work.
      
      The list of test failures before this patch is provided below:
      
          _______________________________ test_zodbanalyze _______________________________
      
          tmpdir = local('/tmp/pytest-of-kirr/pytest-22/test_zodbanalyze0')
          capsys = <_pytest.capture.CaptureFixture object at 0x7f3de6835c70>
      
              def test_zodbanalyze(tmpdir, capsys):
                  tfs1 = fs1_testdata_py23(tmpdir,
                                  os.path.join(os.path.dirname(__file__), "testdata", "1.fs"))
      
                  for use_dbm in (False, True):
          >           report(
                          analyze(
                              tfs1,
                              use_dbm=use_dbm,
                              delta_fs=False,
                              tidmin=None,
                              tidmax=None,
                          ),
                          csv=False,
                      )
      
          zodbtools/test/test_analyze.py:30:
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      
          rep = <zodbtools.zodbanalyze.Report object at 0x7f3de5e16b20>, csv = False
      
              def report(rep, csv=False):
                  ...
                          print (fmtp % (t_display, rep.TYPEMAP[t], rep.TYPESIZE[t],
                                         pct, rep.TYPESIZE[t] * 1.0 / rep.TYPEMAP[t],
          >                              rep.COIDSMAP[t], rep.CBYTESMAP[t],
                                         rep.FOIDSMAP.get(t, 0), rep.FBYTESMAP.get(t, 0)))
          E               KeyError: b'persistent.mapping.PersistentMapping'
      
          zodbtools/zodbanalyze.py:147: KeyError
      
          ____________________________ test_zodbcommit[!zext] ____________________________
      
          zext = <function zext.<locals>._ at 0x7f3deb5c3e50>
      
              @func
              def test_zodbcommit(zext):
                  tmpd = mkdtemp('', 'zodbcommit.')
                  defer(lambda: rmtree(tmpd))
      
                  stor = storageFromURL('%s/2.fs' % tmpd)
                  defer(stor.close)
      
                  head = stor.lastTransaction()
      
                  # commit some transactions via zodbcommit and verify if storage dump gives
                  # what is expected.
                  t1 = Transaction(z64, ' ', b'user name', b'description ...', zext(dumps({'a': 'b'}, _protocol)), [
                      ObjectData(p64(1), b'data1', 'sha1', sha1(b'data1')),
                      ObjectData(p64(2), b'data2', 'sha1', sha1(b'data2'))])
      
                  t1.tid = zodbcommit(stor, head, t1)
      
                  t2 = Transaction(z64, ' ', b'user2', b'desc2', b'', [
                      ObjectDelete(p64(2))])
      
                  t2.tid = zodbcommit(stor, t1.tid, t2)
      
                  buf = BytesIO()
                  zodbdump(stor, p64(u64(head)+1), None, out=buf)
                  dumped = buf.getvalue()
      
          >       assert dumped == b''.join([_.zdump() for _ in (t1, t2)])
      
          zodbtools/test/test_commit.py:61:
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
          zodbtools/test/test_commit.py:61: in <listcomp>
              assert dumped == b''.join([_.zdump() for _ in (t1, t2)])
          zodbtools/zodbdump.py:521: in zdump
              z += obj.zdump()
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      
          self = <zodbtools.zodbdump.ObjectData object at 0x7f3de5d26d90>
      
              def zdump(self):
                  data = self.data
                  hashonly = isinstance(data, HashOnly)
                  if hashonly:
                      size = data.size
                  else:
                      size = len(data)
          >       z = b'obj %s %d %s:%s' % (ashex(self.oid), size, self.hashfunc, ashex(self.hash_))
          E       TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str'
      
          zodbtools/zodbdump.py:569: TypeError
      
          _______________________________ test_dumpreader ________________________________
      
              def test_dumpreader():
                  in_ = b"""\
              txn 0123456789abcdef " "
              user "my name"
              description "o la-la..."
              extension "zzz123 def"
              obj 0000000000000001 delete
              obj 0000000000000002 from 0123456789abcdee
              obj 0000000000000003 54 adler32:01234567 -
              obj 0000000000000004 4 sha1:9865d483bc5a94f2e30056fc256ed3066af54d04
              ZZZZ
              obj 0000000000000005 9 crc32:52fdeac5
              ABC
      
              DEF!
      
              txn 0123456789abcdf0 " "
              user "author2"
              description "zzz"
              extension "qqq"
      
              """
      
                  r = DumpReader(BytesIO(in_))
          >       t1 = r.readtxn()
      
          zodbtools/test/test_dump.py:78:
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
          zodbtools/zodbdump.py:443: in readtxn
              self._badline('unknown hash function %s' % qq(hashfunc))
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      
          self = <zodbtools.zodbdump.DumpReader object at 0x7f3de5d69cd0>
          msg = 'unknown hash function "adler32"'
      
              def _badline(self, msg):
          >       raise RuntimeError("%s+%d: invalid line: %s (%s)" % (_ioname(self._r), self.lineno, msg, qq(self._line)))
          E       RuntimeError: +7: invalid line: unknown hash function "adler32" ("obj 0000000000000003 54 adler32:01234567 -")
      
          zodbtools/zodbdump.py:382: RuntimeError
      
          ___________________________ test_zodbrestore[!zext] ____________________________
      
          tmpdir = local('/tmp/pytest-of-kirr/pytest-22/test_zodbrestore__zext_0')
          zext = <function zext.<locals>._ at 0x7f3de5d6ddc0>
      
              @func
              def test_zodbrestore(tmpdir, zext):
                  zkind = '_!zext' if zext.disabled else ''
      
                  # restore from testdata/1.zdump.ok and verify it gives result that is
                  # bit-to-bit identical to testdata/1.fs
                  tdata = dirname(__file__) + "/testdata"
                  @func
                  def _():
                      zdump = open("%s/1%s.zdump.raw.ok" % (tdata, zkind), 'rb')
                      defer(zdump.close)
      
                      stor = storageFromURL('%s/2.fs' % tmpdir)
                      defer(stor.close)
      
                      zodbrestore(stor, zdump)
          >       _()
      
          zodbtools/test/test_restore.py:49:
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
          ../../venv/py3.venv/lib/python3.9/site-packages/decorator.py:232: in fun
              return caller(func, *(extras + args), **kw)
          ../../../tools/go/pygolang/golang/__init__.py:103: in _
              return f(*argv, **kw)
          zodbtools/test/test_restore.py:48: in _
              zodbrestore(stor, zdump)
          zodbtools/zodbrestore.py:39: in zodbrestore
              txn = zr.readtxn()
          zodbtools/zodbdump.py:443: in readtxn
              self._badline('unknown hash function %s' % qq(hashfunc))
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      
          self = <zodbtools.zodbdump.DumpReader object at 0x7f3de5d79e20>
          msg = 'unknown hash function "sha1"'
      
              def _badline(self, msg):
          >       raise RuntimeError("%s+%d: invalid line: %s (%s)" % (_ioname(self._r), self.lineno, msg, qq(self._line)))
          E       RuntimeError: /home/kirr/src/wendelin/z/zodbtools/zodbtools/test/testdata/1_!zext.zdump.raw.ok+5: invalid line: unknown hash function "sha1" ("obj 0000000000000000 61 sha1:664e6de0f153d8eaeda638d616a320c6e3c5feb1")
      
          zodbtools/zodbdump.py:382: RuntimeError
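
      The two failure modes in the tracebacks above can be reproduced on py3
      with the standard library alone. This sketch only shows the str/bytes
      mismatch; it does not use pygolang's bstr, and `registry` is an
      illustrative stand-in for a str-keyed mapping such as hashRegistry.

```python
registry = {'sha1': 'hashlib.sha1'}   # illustrative str-keyed registry

# 1) bytes used where a str key is expected: the lookup misses on py3.
try:
    registry[b'sha1']
    lookup_error = None
except KeyError as e:
    lookup_error = e

# 2) str used where bytes are being formatted: %-formatting rejects it on py3.
try:
    b'obj %s' % ('sha1',)
    format_error = None
except TypeError as e:
    format_error = e
```

      Whichever single type is chosen, one of the two uses breaks, which is the
      dilemma the commit message describes.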
    • zodbcommit: Fix stdin reading on py3 · b21fbe23
      Kirill Smelkov authored
      Zodbcommit reads input in zodbdump format from stdin and then uses
      zodbdump.DumpReader to parse that input. The parser works on binary
      data.
      
      However, zodbcommit was preparing that input data by mixing bytes and
      strings, which fails on py3:
      
          (py3.venv) kirr@deca:~/src/wendelin/z/zodbtools$ zodb commit 1.fs 00
          Ignoring index for /home/kirr/src/wendelin/z/zodbtools/1.fs
          aaa
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/venv/py3.venv/bin/zodb", line 33, in <module>
              sys.exit(load_entry_point('zodbtools', 'console_scripts', 'zodb')())
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main
              return command_module.main(argv)
            File "/home/kirr/src/wendelin/venv/py3.venv/lib/python3.9/site-packages/decorator.py", line 232, in fun
              return caller(func, *(extras + args), **kw)
            File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _
              return f(*argv, **kw)
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 222, in main
              zin += sys.stdin.read()
          TypeError: can't concat str to bytes
      
      -> Fix it by reading stdin in binary mode.
      
      No test for now, as zodbcommit.main is not yet covered by tests.
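
      One common way to read a standard stream as bytes on both py2 and py3 is
      sketched below. `read_all_bytes` is a hypothetical helper and may differ
      from what the patch actually does; the fake stdin is used only to keep the
      example self-contained.

```python
import io

def read_all_bytes(stream):
    """Read everything from `stream` as bytes on both py2 and py3."""
    # py3 text streams expose the underlying binary stream as .buffer;
    # py2 file objects already yield bytes and have no .buffer.
    return getattr(stream, 'buffer', stream).read()

# Simulate a py3 text-mode stdin wrapping binary input.
fake_stdin = io.TextIOWrapper(io.BytesIO(b'txn 0123456789abcdef " "\n'))
zin = b''
zin += read_all_bytes(fake_stdin)   # no "can't concat str to bytes" here
```

      In real use the argument would be `sys.stdin`.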
    • zodbdump: Fix pickle disassembly on py3 · 69dc6de1
      Kirill Smelkov authored
      pickletools.dis, which is used to handle --pretty=zpickledis (*),
      expects its output stream to be text-like, not binary. We were passing a
      binary stream to it. As a result, pickle disassembly was failing on py3:
      
          _______________________ test_zodbdump[!zext-zpickledis] ________________________
      
          tmpdir = local('/tmp/pytest-of-kirr/pytest-11/test_zodbdump__zext_zpickledis0')
          zext = <function zext.<locals>._ at 0x7f538b508670>, pretty = 'zpickledis'
      
              @mark.parametrize('pretty', ('raw', 'zpickledis'))
              def test_zodbdump(tmpdir, zext, pretty):
                  tdir  = dirname(__file__)
                  zkind = '_!zext' if zext.disabled else ''
                  tfs1  = fs1_testdata_py23(tmpdir, '%s/testdata/1%s.fs' % (tdir, zkind))
                  stor  = FileStorage(tfs1, read_only=True)
      
                  with open('%s/testdata/1%s.zdump.%s.ok' % (tdir, zkind, pretty), 'rb') as f:
                      dumpok = f.read()
      
                  out = BytesIO()
          >       zodbdump(stor, None, None, pretty=pretty, out=out)
      
          zodbtools/test/test_dump.py:48:
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
          zodbtools/zodbdump.py:165: in zodbdump
              pickletools.dis(dataf, disf) # class
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      
          pickle = <_io.BytesIO object at 0x7f538b577130>
          out = <_io.BytesIO object at 0x7f538b49f8b0>, memo = {}, indentlevel = 4
          annotate = 0
      
              def dis(pickle, out=None, memo=None, indentlevel=4, annotate=0):
                  """Produce a symbolic disassembly of a pickle..."""
                  ...
                  for opcode, arg, pos in genops(pickle):
                      if pos is not None:
          >               print("%5d:" % pos, end=' ', file=out)
          E               TypeError: a bytes-like object is required, not 'str'
      
          /usr/lib/python3.9/pickletools.py:2450: TypeError
      
      -> Fix it by letting pickletools.dis emit its output to StringIO instead of BytesIO.
      
      (*) see 80559a94 "zodbdump: support --pretty option with a format to show
          pickles disassembly"
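
      A minimal sketch of the approach: disassemble into a text buffer and
      encode afterwards. `disassemble` is a hypothetical helper, and the final
      UTF-8 encoding step is an assumption about how the text gets back into a
      binary output stream.

```python
import pickle
import pickletools
from io import BytesIO, StringIO

def disassemble(data):
    """Return the pickletools disassembly of `data` as bytes."""
    disf = StringIO()                        # text stream, as pickletools.dis expects
    pickletools.dis(BytesIO(data), out=disf)
    return disf.getvalue().encode('utf-8')   # back to bytes for a binary sink

text = disassemble(pickle.dumps({'a': 'b'}, 2))
```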
    • tests: Adjust testdata FileStorage for current Python on the fly · e825f80f
      Kirill Smelkov authored
      FileStorage/py2 uses `FS21` magic in file header, whereas
      FileStorage/py3 uses `FS30` magic:
      
          https://github.com/zopefoundation/ZODB/blob/0e72b8b13657/src/ZODB/_compat.py#L39
          https://github.com/zopefoundation/ZODB/blob/0e72b8b13657/src/ZODB/_compat.py#L74
      
      And if, upon opening the database, the file magic does not match what ZODB
      expects, the open is rejected:
      
          https://github.com/zopefoundation/ZODB/blob/0e72b8b13657/src/ZODB/FileStorage/FileStorage.py#L88
          https://github.com/zopefoundation/ZODB/blob/0e72b8b13657/src/ZODB/FileStorage/FileStorage.py#L1625-L1630
      
      The idea is that a database written from Python2 is rejected when opened
      from Python3 and vice versa, because strings/bytes semantics changed
      between py2 and py3.
      
      As a result, many zodbtools tests currently fail on py3 when they try
      to access prepared FileStorage database in testdata, because that
      database was originally prepared on py2. Here is, for example, how
      test_zodbdump fails:
      
          ___________________________ test_zodbdump[zext-raw] ____________________________
      
          zext = <function zext.<locals>._ at 0x7f28530bf9d0>, pretty = 'raw'
      
              @mark.parametrize('pretty', ('raw', 'zpickledis'))
              def test_zodbdump(zext, pretty):
                  tdir  = dirname(__file__)
                  zkind = '_!zext' if zext.disabled else ''
          >       stor  = FileStorage('%s/testdata/1%s.fs' % (tdir, zkind), read_only=True)
      
          zodbtools/test/test_dump.py:41:
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
          ../ZODB/src/ZODB/FileStorage/FileStorage.py:315: in __init__
              self._pos, self._oid, tid = read_index(
          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
      
          file = <_io.BufferedReader name='/home/kirr/src/wendelin/z/zodbtools/zodbtools/test/testdata/1.fs'>
          name = '/home/kirr/src/wendelin/z/zodbtools/zodbtools/test/testdata/1.fs'
          index = <ZODB.fsIndex.fsIndex object at 0x7f2852fee2b0>, tindex = {}
          stop = b'\xff\xff\xff\xff\xff\xff\xff\xff'
          ltid = b'\x00\x00\x00\x00\x00\x00\x00\x00', start = 4
          maxoid = b'\x00\x00\x00\x00\x00\x00\x00\x00', recover = 0, read_only = True
      
              def read_index(file, name, index, tindex, stop=b'\377'*8,
                             ltid=z64, start=4, maxoid=z64, recover=0, read_only=0):
                  """Scan the file storage and update the index."""
                  ...
                  if file_size:
                      if file_size < start:
                          raise FileStorageFormatError(file.name)
                      seek(0)
                      if read(4) != packed_version:
          >               raise FileStorageFormatError(name)
          E               ZODB.FileStorage.FileStorage.FileStorageFormatError: /home/kirr/src/wendelin/z/zodbtools/zodbtools/test/testdata/1.fs
      
          ../ZODB/src/ZODB/FileStorage/FileStorage.py:1630: FileStorageFormatError
      
      Since zodbtools primarily works on raw data without decoding stored
      pickles, unlike Zope or ERP5, it should not be a problem for zodbtools
      to work on py3 with the database that was prepared on py2.
      
      -> Adjust all tests to use FileStorage data generated on the fly, based
      on the original files in testdata/ but with the FileStorage header
      rewritten to match the current Python.
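
      The header rewrite can be sketched as below. `fs_for_current_python` is a
      hypothetical name and works on in-memory bytes; the real test helper may be
      organized differently, for example by writing a converted copy into tmpdir.

```python
import sys

MAGIC_PY2 = b'FS21'
MAGIC_PY3 = b'FS30'

def fs_for_current_python(data):
    """Return FileStorage bytes with the 4-byte magic matching this Python."""
    if data[:4] not in (MAGIC_PY2, MAGIC_PY3):
        raise ValueError("not a FileStorage file: bad magic %r" % data[:4])
    magic = MAGIC_PY3 if sys.version_info[0] >= 3 else MAGIC_PY2
    return magic + data[4:]          # everything after the magic is untouched

converted = fs_for_current_python(b'FS21' + b'\x02\x85payload')
```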
    • util += writefile · 3cb93096
      Kirill Smelkov authored
      A counterpart to readfile - to write a file instead of reading it.
      We will need this function in the next patch.
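
      A plausible shape for the readfile/writefile pair, assuming both operate on
      whole files in binary mode; the actual signatures in zodbtools.util may
      differ.

```python
import os
import tempfile

def readfile(path):
    """Return the whole content of `path` as bytes."""
    with open(path, 'rb') as f:
        return f.read()

def writefile(path, data):
    """Write bytes `data` to `path`, replacing any previous content."""
    with open(path, 'wb') as f:
        f.write(data)

path = os.path.join(tempfile.mkdtemp(), 'x.bin')
writefile(path, b'FS30\x00\x01')
roundtrip = readfile(path)
```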
    • util: Factor readfile function into here · adec18bd
      Kirill Smelkov authored
      Soon we will need to use it not only from test_restore.py
  6. 29 Mar, 2022 1 commit
  7. 01 Apr, 2021 1 commit
    • zodbrestore: Mark restore-with-extension tests as xfail on ZODB4 · aa7e1966
      Jérome Perrin authored
      @kirr wrote (!19 (comment 129442))
      
          For the reference - contrary to ZODB5, restore tests on ZODB4 are currently
          [broken](https://nexedijs.erp5.net/#/test_result_module/20210317-B3AC205A/2).
          Restored file is not bit-to-bit identical to the original.
      
          The problem is that on commit/restore, we need to save
          user/description/extension. For extension `zodbdump.Transaction` provides
          .extension_bytes, which ZODB5 uses to save its raw copy. However ZODB4 goes
          through `.extension` and pickles it:
      
          https://lab.nexedi.com/nexedi/zodbtools/blob/129afa67/zodbtools/zodbdump.py#L425-453
          https://github.com/zopefoundation/ZODB/blob/4/src/ZODB/BaseStorage.py#L220-L240
      
          This leads to unpickle-repickle round-trip and different extension being committed on restore:
      
          ```diff
          diff --git a/1zdump b/2zdump
          index 5033bc1..a3a32aa 100644
          --- a/1zdump
          +++ b/2zdump
          @@ -10,7 +10,7 @@ q^A.
           txn 0285cbac3d0369e6 " "
           user "user0.0"
           description "step 0.0"
          -extension "\x80\x02}q\x01(U\tx-cookieSU\x05RF9IEU\vx-generatorq\x02U\fzodb/py2 (f)u."
          +extension "}q\x01(U\tx-cookieSU\x05RF9IEU\vx-generatorU\fzodb/py2 (f)u."
           obj 0000000000000000 98 sha1:eba252d1984f975ecb636bc1b3a89c953dd20527
          ...
          ```
      
          What might save us is to somehow have Transaction.extension return a
          dict-subclass object that is pickled to the exact bytes remembered when
          it was created. However, after briefly checking, I could not find a
          mechanism to do so yet...
      
      @jerome wrote (!19 (comment 129479))
      
          @kirr we already have pytest fixtures to test differently depending on whether
          the ZODB version has support for extension_bytes, so what about using it in the
          test and testing restoring the extension bytes version of the dump only for
          ZODB5 ?
      
      @kirr wrote (!19 (comment 129482))
      
          @jerome, yes we have this, but I believe we should actually fix zodbrestore to
          be reliable whatever ZODB is used. For ZODB5 it works. For ZODB4-wc2 we can
          adjust ZODB code to use extension_bytes similarly to how ZODB5 does. But
          unpatched ZODB4 is currently out of luck. As it was decided that Nexedi will
          use both ZODB4 and ZODB4-wc2, I think we should fix zodbrestore to work on all
          those versions to be reliable.
      
          /cc @tomo
      
      @kirr:
      
      -> No universal ZODB4 fix for now (this would require monkey-patching ZODB in
      several places), so mark the "restore with extension" test as xfail, similarly
      to how we already do for the "dump with extension" test.
      
      This brings -ZODB4 and -ZODB4-wc2 tests back to PASS state.
      
      Even though on ZODB4 the extension is not restored bit-to-bit exactly, it is
      restored as a dictionary equal to the one used to produce the
      dump. Not ideal, but still not losing information in practice.
      
      One more reason to switch to ZODB5...
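
      The unpickle-repickle effect described above can be shown with the
      standard library. This sketch assumes the second pickling uses a different
      protocol, as the missing \x80\x02 header in the diff suggests; the
      dictionary content is illustrative.

```python
import pickle

ext = {'x-cookie': 'RF9IE', 'x-generator': 'zodb/py2 (f)'}

raw      = pickle.dumps(ext, 2)                 # bytes as stored in the original
repickle = pickle.dumps(pickle.loads(raw), 1)   # assumed: re-pickled with protocol 1

same_dict  = pickle.loads(repickle) == ext      # information is preserved
same_bytes = raw == repickle                    # but the bytes are not
```

      This is why the restored storage holds an equal dictionary yet is not
      bit-to-bit identical to the original.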
  8. 16 Mar, 2021 2 commits
    • zodbcommit: Provide full context when reporting errors · 129afa67
      Kirill Smelkov authored
      In the previous patch we taught the object copy handler to report more
      details, but it was still incomplete: the error was missing details
      about which operation was run - commit, or restore of a particular
      transaction.
      
      Note also that other errors reported from that function lack such
      context.
      
      -> So fix it universally, at least for zodbcommit for now: set top-level
      runctx to topic of what we are doing, and use that runctx when
      generating errors. Runctx describes what we are running, and could be
      also later used for logging and tracing. That's why it is called runctx
      instead of just errctx for "error context".
      
      TODO currently only exceptions that we explicitly raise get the
      context. If an exception is raised by something that we call, the
      context won't be added. It would be good to later rework error handling
      and append such context to any raised error. Defer and
      https://lab.nexedi.com/kirr/go123/blob/863c4602/xerr/__init__.py have
      something preliminary for this.
      
      The particular error when restoring a missing object copy becomes
      
          ValueError: /tmp/demo002868462/δ0285cbac75555580/δ.fs: restore 0285cbacb70a3db3 @0285cbacb258bf66: object 0000000000000003: copy from @0285cbac70a3d733: no data
      
      instead of older
      
          ValueError: /tmp/demo358030847/δ0285cbac75555580/δ.fs: object 0000000000000003: copy from @0285cbac70a3d733: no data
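
      A minimal sketch of the runctx idea, assuming a small helper class. The
      names RunCtx, push, error and restore_copy are hypothetical and are not
      taken from zodbcommit; the storage name and message layout only mimic
      the error shown above.

```python
# Hypothetical sketch: carry a "run context" and prefix explicitly raised
# errors with it. This is not the actual zodbcommit implementation.

class RunCtx:
    def __init__(self, *topics):
        self.topics = list(topics)

    def push(self, topic):
        # return a new context extended with one more nested topic
        return RunCtx(*(self.topics + [topic]))

    def error(self, msg):
        # build the exception with the full context as prefix; caller raises it
        return ValueError(": ".join(self.topics + [msg]))


def restore_copy(runctx, oid, copy_from, data):
    ctx = runctx.push("object %s" % oid)
    if data is None:
        raise ctx.error("copy from @%s: no data" % copy_from)
    return data


top = RunCtx("data.fs", "restore 0285cbacb70a3db3 @0285cbacb258bf66")
try:
    restore_copy(top, "0000000000000003", "0285cbac70a3d733", None)
except ValueError as e:
    message = str(e)
```

      As in the TODO above, only errors built through the context get the
      prefix; exceptions raised by called code pass through unchanged.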
      
      /reviewed-by @jerome
      /reviewed-on nexedi/zodbtools!20
      129afa67
    • Kirill Smelkov's avatar
      zodbcommit: Robustify copy handling · fa00c283
      Kirill Smelkov authored
      When zodbdump input says to copy an object, we first load that object.
      However, if the object does not exist, loadBefore raises POSKeyError, and
      if the object was deleted at the copied-from revision, loadBefore returns None.
      
      -> Handle that explicitly to provide failure details to the user, so
      that instead of cryptic
      
          === RUN   TestLoad/δstart=0285cbac75555580
          Traceback (most recent call last):
            File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
              "__main__", fname, loader, pkg_name)
            File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
              exec code in run_globals
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 133, in <module>
              main()
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main
              return command_module.main(argv)
            File "<decorator-gen-6>", line 2, in main
            File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _
              return f(*argv, **kw)
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main
              zodbrestore(stor, asbinstream(sys.stdin), _)
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore
              zodbcommit(stor, at, txn)
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 122, in zodbcommit
              _()
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 91, in _
              data, _, _ = stor.loadBefore(obj.oid, p64(u64(obj.copy_from)+1))
          TypeError: 'NoneType' object is not iterable
              xtesting.go:483: /tmp/demo009767458/δ0285cbac75555580/δ.fs: zpyrestore: exit status 1
      
      it fails with something more understandable:
      
          === RUN   TestLoad/δstart=0285cbac75555580
          Traceback (most recent call last):
            File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
              "__main__", fname, loader, pkg_name)
            File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
              exec code in run_globals
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 133, in <module>
              main()
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main
              return command_module.main(argv)
            File "<decorator-gen-6>", line 2, in main
            File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _
              return f(*argv, **kw)
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main
              zodbrestore(stor, asbinstream(sys.stdin), _)
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore
              zodbcommit(stor, at, txn)
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 129, in zodbcommit
              _()
            File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 97, in _
              (stor.getName(), ashex(obj.oid), ashex(obj.copy_from)))
          ValueError: /tmp/demo358030847/δ0285cbac75555580/δ.fs: object 0000000000000003: copy from @0285cbac70a3d733: no data
              xtesting.go:483: /tmp/demo358030847/δ0285cbac75555580/δ.fs: zpyrestore: exit status 1
      
      For the implementation it would be easier to use loadAt
      (https://github.com/zopefoundation/ZODB/pull/323), but we don't have
      that yet.
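
      The two failure modes described above can be sketched as follows. This
      is an illustration only: FakeStorage and load_for_copy are hypothetical
      names, and POSKeyError here is a local stand-in for
      ZODB.POSException.POSKeyError so that the sketch runs without ZODB.

```python
class POSKeyError(KeyError):
    """Local stand-in for ZODB.POSException.POSKeyError (assumption)."""


class FakeStorage:
    """Hypothetical in-memory storage: oid -> (data, serial, next_serial) or None."""

    def __init__(self, name, objects):
        self._name = name
        self._objects = objects

    def getName(self):
        return self._name

    def loadBefore(self, oid, tid):
        # this fake ignores tid; the real code passes copy_from + 1
        if oid not in self._objects:
            raise POSKeyError(oid)      # object never existed
        return self._objects[oid]       # None means "deleted at that revision"


def load_for_copy(stor, oid, copy_from):
    # treat "no such object" and "deleted" the same way: there is no data to copy
    try:
        r = stor.loadBefore(oid, copy_from)
    except POSKeyError:
        r = None
    if r is None:
        raise ValueError("%s: object %s: copy from @%s: no data"
                         % (stor.getName(), oid, copy_from))
    data, _, _ = r
    return data
```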
      
      /reviewed-by @jerome
      /reviewed-on nexedi/zodbtools!20
      fa00c283
  9. 15 Mar, 2021 4 commits
  10. 10 Mar, 2021 2 commits
  11. 02 Nov, 2020 1 commit
  12. 30 Apr, 2020 1 commit
  13. 29 Apr, 2020 6 commits
    • Kirill Smelkov's avatar
      tidrange: test: Fix for py3 · 2236aaaf
      Kirill Smelkov authored
      ashex gives bytes, whereas reference_tid was str.
      2236aaaf
    • Kirill Smelkov's avatar
      *: dict.keys() returns sequence, not [] on py3 · 7851a964
      Kirill Smelkov authored
      On py3 dict.keys() returns a view, which cannot be indexed, e.g.
      
          In [5]: d = {1:2}
      
          In [6]: kv = d.keys()
      
          In [7]: kv
          Out[7]: dict_keys([1])
      
          In [8]: kv[0]
          ---------------------------------------------------------------------------
          TypeError                                 Traceback (most recent call last)
          <ipython-input-8-643f90e1910b> in <module>()
          ----> 1 kv[0]
      
          TypeError: 'dict_keys' object is not subscriptable
      
      -> Use list(dict.keys()) in places where we need random access.
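
      For illustration, a short sketch of the fix (variable names are made up
      for the example):

```python
d = {1: 2}

# dict.keys() is a view on Python 3; wrap it in list() before indexing
keys = list(d.keys())
first = keys[0]

# when only one key is needed, iterating avoids building the list
also_first = next(iter(d))
```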
      7851a964
    • Kirill Smelkov's avatar
      *: Pass bytes literal into BytesIO · 2f9e0623
      Kirill Smelkov authored
      Otherwise it breaks with str on py3:
      
      	In [1]: from io import BytesIO
      
      	In [2]: BytesIO("abc")
      	---------------------------------------------------------------------------
      	TypeError                                 Traceback (most recent call last)
      	<ipython-input-2-52a130edd46d> in <module>()
      	----> 1 BytesIO("abc")
      
      	TypeError: a bytes-like object is required, not 'str'
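
      A short sketch of the fixed form on Python 3 (variable names are made
      up for the example):

```python
from io import BytesIO

# a bytes literal is accepted on both Python 2 and Python 3
buf = BytesIO(b"abc")
data = buf.read()

# a str literal is rejected on Python 3
try:
    BytesIO("abc")
    str_rejected = False
except TypeError:
    str_rejected = True
```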
      2f9e0623
    • Kirill Smelkov's avatar
      zodbdump: Use bytes to emit its output · d3152c78
      Kirill Smelkov authored
      Zodbdump format is text-binary and is saved into files opened in binary
      mode. -> We have to emit bytes - not strings - into it, since otherwise
      on Python3 it would break.
      
      This needs qq support from pygolang[1] to be able to use qq with both
      string and bytestring format, e.g. for
      
      	 "hello %s" % qq(name),	and
      	b"hello %s" % qq(name)
      
      to give the same output regardless of whether name is str or bytes.
      
      [1] nexedi/pygolang!1
      d3152c78
    • Kirill Smelkov's avatar
      *: Zodbdump format is semi text-binary: Mark it as such + handle zdump output as binary · ddd5fd03
      Kirill Smelkov authored
      Zodbdump format is already described as semi text-binary in top-level
      zodbdump.py documentation. However zdump() docstring was referring to it
      as "text". Fix it and use binary to handle places where zdump is
      loaded/saved.
      ddd5fd03
    • Kirill Smelkov's avatar
      *: Don't use %r to print/report lines/bytes to outside · bc608aea
      Kirill Smelkov authored
      %r has different output for strings and bytes on python3:
      
      	In [1]: a = 'hello'
      	In [2]: b = b'hello'
      
      	In [3]: repr(a)
      	Out[3]: "'hello'"
      
      	In [4]: repr(b)
      	Out[4]: "b'hello'"
      
      -> Use qq, whose output is stable regardless of whether the input is string or bytes.
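
      To illustrate the problem and the kind of helper that avoids it, here is
      a simplified stand-in. quote_stable is a hypothetical function written
      for this example; it is not pygolang's qq and handles far fewer cases.

```python
def quote_stable(x):
    # Hypothetical stand-in, NOT pygolang's qq: quote str and bytes the same way.
    if isinstance(x, bytes):
        x = x.decode("utf-8", "replace")
    return '"%s"' % x.replace("\\", "\\\\").replace('"', '\\"')


a = 'hello'
b = b'hello'

reprs_differ = repr(a) != repr(b)               # "'hello'" vs "b'hello'" on py3
quotes_match = quote_stable(a) == quote_stable(b)
```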
      bc608aea
  14. 13 Mar, 2020 1 commit
  15. 14 Feb, 2020 1 commit
  16. 09 Jul, 2019 1 commit
  17. 03 Jun, 2019 1 commit
    • Kirill Smelkov's avatar
      More python3 compatibility · b44f9c0d
      Kirill Smelkov authored
      @jerome, I was trying to make zodbtools work with Python3 and along that road picked some bits of your work from nexedi/zodbtools!12. At present the migration to Python3 is not complete, and even though I now have the answer to how to handle strings in both python2/3 in a compatible and reasonable way (I can share details if you are interested), I have to put that work on hold for some time and use https://pypi.org/project/pep3134 directly in wcfs tests, since getting all string details right, even after figuring out how to do it, will take time. Anyway the bits presented here should be ready for master and could be merged now. Could you please have a look?
      
      Thanks beforehand,  
      Kirill
      
      /reviewed-on nexedi/zodbtools!13
      b44f9c0d
  18. 24 May, 2019 5 commits
    • Kirill Smelkov's avatar
      zodbdump: Default out to stdout in binary mode · c5f20201
      Kirill Smelkov authored
      Zodbdump format is mixed text+binary so dumping to unicode stdout won't
      work.
      
      Based on patch by Jérome Perrin.
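
      One common way to get a binary stream out of a text stream on Python 3
      is its .buffer attribute. The sketch below assumes that approach; the
      helper name binary_stream is hypothetical and the patch itself may do it
      differently.

```python
import io


def binary_stream(f):
    # Python 3 text streams expose the underlying binary stream as .buffer;
    # streams without it (e.g. Python 2 stdout) are returned as-is.
    return getattr(f, "buffer", f)


# demonstrate with an in-memory text wrapper instead of the real stdout
raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding="utf-8")

out = binary_stream(text)
out.write(b"ZZZZ\xff\x00\n")      # mixed text+binary payload goes through unchanged
out.flush()
written = raw.getvalue()
```

      With the real stdout the call would be binary_stream(sys.stdout).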
      c5f20201
    • Kirill Smelkov's avatar
      *: s.decode('hex') -> fromhex(s) · b508f108
      Kirill Smelkov authored
      Because on Py3:
      
              def test_dumpreader():
                  in_ = b"""\
              txn 0123456789abcdef " "
              user "my name"
              description "o la-la..."
              extension "zzz123 def"
              obj 0000000000000001 delete
              obj 0000000000000002 from 0123456789abcdee
              obj 0000000000000003 54 adler32:01234567 -
              obj 0000000000000004 4 sha1:9865d483bc5a94f2e30056fc256ed3066af54d04
              ZZZZ
              obj 0000000000000005 9 crc32:52fdeac5
              ABC
      
              DEF!
      
              txn 0123456789abcdf0 " "
              user "author2"
              description "zzz"
              extension "qqq"
      
              """
      
                  r = DumpReader(BytesIO(in_))
                  t1 = r.readtxn()
                  assert isinstance(t1, Transaction)
          >       assert t1.tid == '0123456789abcdef'.decode('hex')
          E       AttributeError: 'str' object has no attribute 'decode'
      
          test/test_dump.py:77: AttributeError
      
      Based on patch by Jérome Perrin.
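
      A minimal sketch of a fromhex helper that works on both Python 2 and
      Python 3, assuming it goes through codecs.decode as the later ashex
      commit describes. The body is an illustration, not a copy of
      zodbtools.util.fromhex.

```python
import codecs


def fromhex(s):
    # decode a hex string into raw bytes via the 'hex' codec
    return codecs.decode(s, 'hex')


tid = fromhex('0123456789abcdef')
```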
      b508f108
    • Kirill Smelkov's avatar
      utils: Initialize hashers with bytes · 1418c86f
      Kirill Smelkov authored
      	self = <zodbtools.util.CRC32Hasher object at 0x7f887ae465f8>
      
      	    def __init__(self):
      	>       self._h = crc32('')
      	E       TypeError: a bytes-like object is required, not 'str'
      
      	util.py:208: TypeError
      
      Based on patch by Jérome Perrin.
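
      A sketch of a CRC32 hasher initialized with bytes. Only the class name
      and the crc32('') line come from the traceback above; the use of
      zlib.crc32 and the update and hexdigest methods are assumptions made for
      the example.

```python
import zlib


class CRC32Hasher:
    name = "crc32"

    def __init__(self):
        # must be b'' (not '') so that it works on Python 3
        self._h = zlib.crc32(b'')

    def update(self, data):
        # continue the running checksum with the next chunk
        self._h = zlib.crc32(data, self._h)

    def hexdigest(self):
        # mask to unsigned 32 bits so the result is the same on py2 and py3
        return "%08x" % (self._h & 0xffffffff)


h = CRC32Hasher()
h.update(b"ab")
h.update(b"c")
incremental = h.hexdigest()
```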
      1418c86f
    • Kirill Smelkov's avatar
      *: Pass bytes - not unicode - literals to sha1() · a7eee284
      Kirill Smelkov authored
      	data = 'data1'
      
      	    def sha1(data):
      	        m = hashlib.sha1()
      	>       m.update(data)
      	E       TypeError: Unicode-objects must be encoded before hashing
      
      	zodbtools/util.py:38: TypeError
      
      Based on patch by Jérome Perrin.
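
      For illustration, the same helper called with a bytes literal. The
      return value (the raw digest) is an assumption, since the excerpt above
      does not show the end of the function.

```python
import hashlib


def sha1(data):
    m = hashlib.sha1()
    m.update(data)          # data must be bytes on Python 3
    return m.digest()       # assumption: the helper returns the raw digest


digest = sha1(b'data1')

# text has to be encoded explicitly before hashing
same_digest = sha1(u'data1'.encode('utf-8'))
```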
      a7eee284
    • Kirill Smelkov's avatar
      util: Fix ashex for Python3 · 7a7370e6
      Kirill Smelkov authored
      	s = b'\x03\xc4\x85v\x00\x00\x00\x00'
      
      	    def ashex(s):
      	>       return s.encode('hex')
      	E       AttributeError: 'bytes' object has no attribute 'encode'
      
      	zodbtools/util.py:29: AttributeError
      
      s.encode('hex') used to work on Py2 but fails on Py3:
      
      	In [1]: s = "abc"
      
      	In [2]: b = b"def"
      
      	In [3]: s.encode('hex')
      	---------------------------------------------------------------------------
      	LookupError                               Traceback (most recent call last)
      	<ipython-input-3-75ae843597fe> in <module>()
      	----> 1 s.encode('hex')
      
      	LookupError: 'hex' is not a text encoding; use codecs.encode() to handle arbitrary codecs
      
      	In [4]: b.encode('hex')
      	---------------------------------------------------------------------------
      	AttributeError                            Traceback (most recent call last)
      	<ipython-input-4-ec2fccff20bc> in <module>()
      	----> 1 b.encode('hex')
      
      	AttributeError: 'bytes' object has no attribute 'encode'
      
      	In [5]: import codecs
      
      	In [6]: codecs.encode(b, 'hex')
      	Out[6]: b'646566'
      
      	In [7]: codecs.encode(s, 'hex')
      	---------------------------------------------------------------------------
      	TypeError                                 Traceback (most recent call last)
      	/usr/lib/python3.7/encodings/hex_codec.py in hex_encode(input, errors)
      	     14     assert errors == 'strict'
      	---> 15     return (binascii.b2a_hex(input), len(input))
      	     16
      
      	TypeError: a bytes-like object is required, not 'str'
      
      	The above exception was the direct cause of the following exception:
      
      	TypeError                                 Traceback (most recent call last)
      	<ipython-input-7-7fcb16cead4f> in <module>()
      	----> 1 codecs.encode(s, 'hex')
      
      	TypeError: encoding with 'hex' codec failed (TypeError: a bytes-like object is required, not 'str')
      
      After the patch it works with bytes and raises for str.
      Fromhex does not need to be changed - it already goes through codecs.decode,
      as originally added in dd959b28 (zodbdump += DumpReader - to read/parse
      zodbdump stream).
      
      Based on patch by Jérome Perrin.
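
      A minimal sketch of ashex after the fix, assuming it uses codecs.encode
      as in the session above. The body is an illustration, not a copy of
      zodbtools.util.ashex.

```python
import codecs


def ashex(b):
    # hex-encode bytes; returns bytes, and raises TypeError for str on Python 3
    return codecs.encode(b, 'hex')


h = ashex(b'def')

try:
    ashex(u'def')
    str_rejected = False
except TypeError:
    str_rejected = True
```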
      7a7370e6