1. 24 Sep, 2024 23 commits
    • Kirill Smelkov's avatar
      X go/neo: Update it after merging ZODB/go StrictUnicode + other py3-related improvements · 6235fb60
      Kirill Smelkov authored
      FileStorage testdata now lives in different place(s) and is different
      because gen_testdata.py in zodbtools changed.
      
      -> Adjust fs1 testdata path to load and regenerate neo/storage/sqlite from it.
      6235fb60
    • Kirill Smelkov's avatar
      Merge branch 'y/py3' into t+ypy3 · fbddd624
      Kirill Smelkov authored
      * y/py3: (32 commits)
        go/zodb: Fix _TestPersistentMapListLoad on py3_pickle3
        go/zodbpickle: Switch to unpickle with PyDict=y mode
        go/go.mode: v↑ github.com/kisielk/og-rek to pre-release with PyDict patches
        go/zodb/btree: go/zodb/fs1tools: Test it on all py2/py3 ZODB kinds of data we care about
        go/zodb/fs1: Remove old testdata files
        go/internal/xtesting: Switch to updated FileStorage testdata to validate LoadDBHistory
        go/zodbpickle: Switch to pickling with protocol=3
        go/zodb/zeo: Fix ZRPC/pickle binary and string encoding/decoding wrt py3
        go/zodb/zeo: Test it on all py2/py3 ZODB kinds of data we care about and wrt both ZEO/py2 and ZEO/py3
        go/zodb/demo: Test it on all py2/py3 ZODB kinds of data we care about
        go/zodb/zodbtools: Test it on all py2/py3 ZODB kinds of data we care about
        go/zodb/fs1tools: test: Normalize dumpOk from py3 to py2 syntax
        go/zodb/fs1tools: Test it on all py2/py3 ZODB kinds of data we care about
        go/zodb/fs1tools: DumperFsDump: Print size of every transaction record
        go/zodb/fs1tools: Regenerate testdata with ZODB 5.8.1
        go/zodb/fs1: Fix index save wrt py3
        go/zodb/fs1: Fix index load from py3 data
        go/zodb/fs1: Accept both FS21 and FS30 magics
        go/zodb/fs1: Test FileStorage on all py2/py3 ZODB kinds of data we care about
        go/zodb: Fix PyData.ClassName for py3_pickle3
        ...
      fbddd624
    • Kirill Smelkov's avatar
      go/zodb: Fix _TestPersistentMapListLoad on py3_pickle3 · 0a58eb7a
      Kirill Smelkov authored
      plist is generated to come with str in [0]. On py2 that is bytestring,
      while on py3 it is unicode. However were trying to decode it as
      bytestring only and so it was failing as
      
          --- FAIL: TestPersistentMapListLoad/py3_pickle3 (0.73s)
              persistent_x_test.go:82: plist[0]: got string  ; expect str
      
      -> Fix it by using ogórek.AsString helper as suggested by
      StrictUnicode=y mode documentation in https://github.com/kisielk/og-rek/commit/b28613c2.
      0a58eb7a
    • Kirill Smelkov's avatar
      go/zodbpickle: Switch to unpickle with PyDict=y mode · 1e87fe5b
      Kirill Smelkov authored
      Previously we were unpickling dicts into map[any]any and it was making a
      difference from which Python version the dict was saved: if it was saved
      from py2 with str keys, then the dict needed to be accessed with ByteString
      keys, while if it was saved from py3 with str keys, the dict needed to
      be accessed with string keys. This peculiarity is one of the reasons why
      Map test currently fails on py3_pickle3 - because there - both in the
      test and inside Map.PySetState, dictionaries are accessed via ByteString
      keys which works only wrt py2-saved data.
      
      -> Fix it by switching the unpickler to work in PyDict=y mode(*) which
      results in dicts unpickled as ogórek.Dict instead of map[any]any
      with the ogórek.Dict considering ByteString to be equal to both string and Bytes
      with the same underlying content. This allows programs to access Dict via
      string/bytes keys following Python3 model, while still being able to handle
      dictionaries generated from under Python2.
      
      For ZODB/go this is change in behaviour exposed to outside. However
      there is currently only one known ZODB/go user - WCFS in Wendelin.core -
      and that user will be updated correspondingly as well.
      
      The change fixes the following failure in go/zodb tests
      
          --- FAIL: TestPersistentMapListLoad/py3_pickle3 (0.70s)
              persistent_x_test.go:57: persistent.mapping.PersistentMapping(0000000000000000): activate: pysetstate: noone of ["data" "_container"] is present in state dict
      
      but one more remains there which will be addressed in the next patch.
      
      (*) see https://github.com/kisielk/og-rek/pull/75 for details.
      1e87fe5b
    • Kirill Smelkov's avatar
      go/go.mode: v↑ github.com/kisielk/og-rek to pre-release with PyDict patches · 90799e25
      Kirill Smelkov authored
      We will need PyDict mode in the next patch to fix Map test on
      py3_pickle3 and so that working with unpickled dictionaries is easy
      independently from which Python version - py2 or py3 - they come.
      
      PyDict mode patch: https://github.com/kisielk/og-rek/commit/3bf6c92d
      		  (https://github.com/kisielk/og-rek/pull/75)
      
      Whole set of brought changes:
      
      	https://github.com/kisielk/og-rek/compare/b1bd3f2e3d5e...e691997e3596
      90799e25
    • Kirill Smelkov's avatar
      go/zodb/btree: go/zodb/fs1tools: Test it on all py2/py3 ZODB kinds of data we care about · 6841930a
      Kirill Smelkov authored
      Previously, like with FileStorage, we were testing btree/go only with
      data generated by python2 and pickle protocol=2. However even on py2
      there are more pickle protocols that are in use, and also there is python3.
      
      -> Modernize testdata/gen-testdata to use run_with_all_zodb_pickle_kinds
         that was recently added as part of nexedi/zodbtools@f9d36ba7
         and generate test data with both python2 and python3. It is handy to
         use py2py3-venv(*) to prepare python environment to do that.
      
         Adjust tests on Go side to verify how btree handles all generated zkinds.
      
      All py2_pickle1, py2_pickle2, py2_pickle3 and py3_pickle3 are handled well out of the box.
      
      (*) see nexedi/zodbtools@fac2f190
      6841930a
    • Kirill Smelkov's avatar
      go/zodb/fs1: Remove old testdata files · 0dd81a41
      Kirill Smelkov authored
      We recently modrnized FileStorage/go tests to have all ZODB kinds of
      data which live in testdata/<zkind>/ for example testdata/py2_pickle3/ -
      see recent patch "go/zodb/fs1: Test FileStorage on all py2/py3 ZODB
      kinds of data we care about".
      
      We also updated all other packages to use that new FileStorage testdata.
      
      -> Now is the time for old testdata to go.
      0dd81a41
    • Kirill Smelkov's avatar
      go/internal/xtesting: Switch to updated FileStorage testdata to validate LoadDBHistory · bfd9eeb4
      Kirill Smelkov authored
      Old fs1/testdata/1.fs will be going away -> we need to switch to using
      newly generated multi-zkind testdata. However for this test using only
      one zkind should be enough. Let's pick what we currently primarily use:
      py2_pickle3.
      bfd9eeb4
    • Kirill Smelkov's avatar
      go/zodbpickle: Switch to pickling with protocol=3 · 1bb0eb24
      Kirill Smelkov authored
      Pickle protocol 3 allows to natively support all py bytes and string
      types and is supported by zodbpickle/py on both py2 and py3 out of the
      box. ZODB/py3 is using protocol=3 since 2013 and ZODB/py2 since 2018. In
      other words nowdays protocol=3 is the default protocol with which
      ZODB emits pickles, and it can be read practically everywhere.
      
      The difference in between protocol=2 and protocol=3 is addition of
      BINBYTES and SHORT_BINBYTES opcodes to represent bytes. Without those
      opcodes bytes are encoded as
      
          `_codecs.encode(byt.decode('latin1'), 'latin1')`
      
      which, when unpickled, results in bytes on py3 and str on py2.
      
      Compared to using `BINBYTES byt` this form is much bigger in size, but
      what is more important might turn bytes back into str when decoded and
      reencoded on py2.
      
      This form of bytes encoding is also not accepted by ZEO/py which rejects
      it with
      
          2024-07-18T20:44:39 ERROR ZEO.asyncio.server Can't deserialize message
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/z/ZEO5/src/ZEO/asyncio/server.py", line 100, in message_received
              message_id, async_, name, args = self.decode(message)
            File "/home/kirr/src/wendelin/z/ZEO5/src/ZEO/asyncio/marshal.py", line 119, in pickle_server_decode
              return unpickler.load()  # msgid, flags, name, args
            File "/home/kirr/src/wendelin/z/ZEO5/src/ZEO/asyncio/marshal.py", line 176, in server_find_global
              raise ImportError("Module not allowed: %s" % (module,))
          ImportError: Module not allowed: _codecs
      
      All in all using pickle protocol=3 is ok from both py2 and py3 point of
      view, brings size optimization and correctness, and fixes ZEO and
      probably other issues.
      
      So let the pickles ZODB/go saves and other Go places emit be encoded with protocol=3 now as well.
      
      For the reference, the
      
          // TODO 2 -> 3 since ZODB5 switched to it and uses zodbpickle.
      
      is from 2019 added in a16c9e06 (go/zodb: Teach Persistent to serialize
      itself).
      1bb0eb24
    • Kirill Smelkov's avatar
      go/zodb/zeo: Fix ZRPC/pickle binary and string encoding/decoding wrt py3 · 9f402fcc
      Kirill Smelkov authored
      ZEO/go client was asserting "method" field to be bytestring. As the
      result interoperation with ZEO/py3 server was failing as e.g.
      
          2024-07-18T20:41:14 INFO ZEO.asyncio.server received handshake 'Z5'
          2024/07/18 20:41:14 zlink @ - /tmp/zeo2281880331/1.fs.zeosock: serveRecv1: decode: .0: method: got string; expected str
              zeo_test.go:221: openzeo /tmp/zeo2281880331/1.fs.zeosock: zeo: open /tmp/zeo2281880331/1.fs.zeosock: zlink @ - /tmp/zeo2281880331/1.fs.zeosock: call register: zlink is closed
      
      because ZEO/py3 sends "method" as unicode.
      
      -> Fix it by using pickle.AsString when decoding "method" field.
      
      Similarly ZEO/go client was sending strings as bytestr, which on py3 decode to
      unicode ok only if they are ASCII. Adjust string encoding to be emitted
      as unicode always which works wrt both ZEO/py3 and ZEO/py2.
      
      --------
      
      ZEO/go client was sending binary ZRPC arguments as bytestr as well,
      which was resulting in the following exception on ZEO/py3 side:
      
          2024-07-18T20:42:31 ERROR ZEO.asyncio.server Can't deserialize message
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/z/ZEO/src/ZEO/asyncio/server.py", line 102, in message_received
              message_id, async_, name, args = self.decode(message)
            File "/home/kirr/src/wendelin/z/ZEO/src/ZEO/asyncio/marshal.py", line 122, in pickle_server_decode
              return unpickler.load()  # msgid, flags, name, args
          UnicodeDecodeError: 'ascii' codec can't decode byte 0x85 in position 1: ordinal not in range(128)
          2024/07/18 20:42:31 zlink @ - /tmp/zeo627526011/1.fs.zeosock: recvPkt: EOF
              xtesting.go:361: load 0285cbac70a3d733:0000000000000001: returned err unexpected:
                  have: /tmp/zeo627526011/1.fs.zeosock: load 0285cbac70a3d733:0000000000000001: zlink @ - /tmp/zeo627526011/1.fs.zeosock: call loadBefore: zlink is closed
                  want: nil
      
      -> Fix it by using pickle.Bytes which should be encoded as bytes in the
         pickle stream.
      
      This fix is ok but depends on the next patch to upgrade ZODB
      serialization protocol to 3, because on pickle protocol 2 bytes are
      serialized as `_codecs.encode(byt.decode('latin1'), 'latin1')` and
      ZEO/py forbids import of _codecs:
      
          2024-07-18T20:44:39 ERROR ZEO.asyncio.server Can't deserialize message
          Traceback (most recent call last):
            File "/home/kirr/src/wendelin/z/ZEO5/src/ZEO/asyncio/server.py", line 100, in message_received
              message_id, async_, name, args = self.decode(message)
            File "/home/kirr/src/wendelin/z/ZEO5/src/ZEO/asyncio/marshal.py", line 119, in pickle_server_decode
              return unpickler.load()  # msgid, flags, name, args
            File "/home/kirr/src/wendelin/z/ZEO5/src/ZEO/asyncio/marshal.py", line 176, in server_find_global
              raise ImportError("Module not allowed: %s" % (module,))
          ImportError: Module not allowed: _codecs
      
      We will address this in the next patch.
      9f402fcc
    • Kirill Smelkov's avatar
      go/zodb/zeo: Test it on all py2/py3 ZODB kinds of data we care about and wrt... · df8d5cc3
      Kirill Smelkov authored
      go/zodb/zeo: Test it on all py2/py3 ZODB kinds of data we care about and wrt both ZEO/py2 and ZEO/py3
      
      Similarly to FileStorage, fs1tools and other Go packages previously we
      were testing ZEO/go client only with old FileStorage testdata generated
      by python2 and pickle protocol=2. However even on py2 there are more
      pickle protocols that are in use, and also there is python3.
      
      We were also testing our ZEO/go client only wrt ZEO/py2 but there is
      also ZEO/py3.
      
      -> Adjust ZEO/go testing to automatically load and test against all ZODB
      kinds from recently updated FileStorage testdata and wrt both ZEO/py2
      and ZEO/py3.
      
      All py2_pickle1, py2_pickle2, py2_pickle3 and py3_pickle3 are handled well out of the box.
      However only ZEO/py2 succeeds: tests wrt ZEO/py3 server currently fail
      and so are marked with "xfail".
      
      We will fix tests for ZEO/py3 in the next patch.
      df8d5cc3
    • Kirill Smelkov's avatar
      go/zodb/demo: Test it on all py2/py3 ZODB kinds of data we care about · e76c9501
      Kirill Smelkov authored
      Similarly to FileStorage, fs1tools and zodbtools previously we were testing
      demo only with old FileStorage testdata generated by python2 and pickle
      protocol=2. However even on py2 there are more pickle protocols that
      are in use, and also there is python3.
      
      -> Adjust demo testing to automatically load and test agains all ZODB
      kinds from recently updated FileStorage testdata.
      
      All py2_pickle1, py2_pickle2, py2_pickle3 and py3_pickle3 are handled well out of the box.
      e76c9501
    • Kirill Smelkov's avatar
      go/zodb/zodbtools: Test it on all py2/py3 ZODB kinds of data we care about · c67e2fec
      Kirill Smelkov authored
      Like with FileStorage and fs1tools previously we were testing zodbtools/go only
      with data generated by python2 and pickle protocol=2. However even on py2 there
      are more pickle protocols that are in use, and also there is python3.
      
      -> Modernize testdata generation to automatically pick up all ZODB kinds
         from recently-updated FileStorage/go testdata.
      
         Adjust tests on Go side to verify how zodbtools/go handle all generated zkinds.
      
      All py2_pickle1, py2_pickle2, py2_pickle3 and py3_pickle3 are handled well out of the box.
      c67e2fec
    • Kirill Smelkov's avatar
      go/zodb/fs1tools: test: Normalize dumpOk from py3 to py2 syntax · 93e2f065
      Kirill Smelkov authored
      Because else, py3_pickle3 tests fail e.g. as
      
              --- FAIL: TestFsDump/py3_pickle3/db=1 (0.00s)
                  dump_test.go:70: fsdump: dump different:
                       Trans #00000 tid=0285cbac70a3d733 size=211 time=1979-01-03 21:00:26.400001 offset=117
                      -    status='p' user=b'user0.12' description=b'step 0.12'
                      +    status='p' user='user0.12' description='step 0.12'
      
              --- FAIL: TestFsDumpv/py3_pickle3/db=1 (0.00s)
                  dump_test.go:70: fsdumpv: dump different:
                       ************************************************************
                      -file identifier: b'FS30'
                      +file identifier: 'FS30'
      
              ...
      
      fs1tools/go sticks to py2 syntax because in my view it is more user
      friendly. The preference of py2 or py3 syntax needs to be made anyway
      because for fs1tools/go it is not possible to know beforehand to which
      py kind its output will be compared against.
      
      I think it is not very good for fstools/py output to be dependent on
      python version. For the reference on this topic: output of zodbtools/py
      is designed to be the same on both py2 and py3.
      93e2f065
    • Kirill Smelkov's avatar
      go/zodb/fs1tools: Test it on all py2/py3 ZODB kinds of data we care about · 117172cf
      Kirill Smelkov authored
      Like with FileStorage previously we were testing fs1tools only with data
      generated by python2 and pickle protocol=2. However even on py2 there are
      more pickle protocols that are in use, and also there is python3.
      
      -> Modernize testdata generation to automatically pick up all ZODB kinds
         from recently-updated FileStorage/go testdata.
      
         Adjust tests on Go side to verify how fs1tools handles all generated zkinds.
      
      py2_pickle1, py2_pickle2 and py2_pickle3 are handled well.
      Tests for py3_pickle3 currently fail and so are marked with "xfail".
      
      We will fix tests for py3_pickle3 in follow-up patches.
      117172cf
    • Kirill Smelkov's avatar
      go/zodb/fs1tools: DumperFsDump: Print size of every transaction record · 9d10dcc5
      Kirill Smelkov authored
      Like fsdump/py does.
      
      https://github.com/zopefoundation/ZODB/commit/403f9869 started to print
      size of every transaction record saying that it readded it. And indeed
      this printing was there starting from
      https://github.com/zopefoundation/ZODB/commit/06e757b3 authored in 2003.
      
      Both commits use `size(txn) = trans._tend - trans._tpos` which is wrong
      by 8 because during file iteration tpos points to the beginning of
      transaction and tend points to tpos + tlen, but tlen is full
      transaction length - 8:
      
      https://github.com/zopefoundation/ZODB/blob/5.8.1/src/ZODB/FileStorage/FileStorage.py#L1997-L1998
      https://github.com/zopefoundation/ZODB/blob/5.8.1/src/ZODB/FileStorage/format.py#L28
      
      Mimic fsdump/py behaviour exactly for compatibility, even if it is a bit
      buggy, since it was there for such a long time.
      
      Without the fix TestFsDump was failing like this:
      
          --- FAIL: TestFsDump/db=1 (0.00s)
              dump_test.go:70: fsdump: dump different:
                  -Trans #00000 tid=0285cbac258bf266 size=151 time=1979-01-03 21:00:08.800000 offset=52
                  +Trans #00000 tid=0285cbac258bf266 time=1979-01-03 21:00:08.800000 offset=52
                       status=' ' user='' description='initial database creation'
                     data #00000 oid=0000000000000000 size=61 class=persistent.mapping.PersistentMapping
                   ...
      9d10dcc5
    • Kirill Smelkov's avatar
      go/zodb/fs1tools: Regenerate testdata with ZODB 5.8.1 · ce122ee7
      Kirill Smelkov authored
      https://github.com/zopefoundation/ZODB/commit/403f9869 adjusted fsdump
      to emit size of transaction record. Everything else remains unchanged.
      
      Our fsdump becomes broken, because it does not yet emit that size(txn).
      We will fix it in the next patch.
      ce122ee7
    • Kirill Smelkov's avatar
      go/zodb/fs1: Fix index save wrt py3 · ef615203
      Kirill Smelkov authored
      We were saving index data with bytestrings inside index records. This
      works for py2 but on py3 it causes failure on unpickling because, by default,
      *STRING opcodes are unpickled into unicode on py3:
      
          === RUN   TestIndexSaveToPy/py2_pickle1/py3
          UnicodeDecodeError: 'ascii' codec can't decode byte 0x8d in position 25: ordinal not in range(128)
      
          The above exception was the direct cause of the following exception:
      
          Traceback (most recent call last):
            File "/home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/fs1/./py/indexcmp", line 43, in <module>
              main()
            File "/home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/fs1/./py/indexcmp", line 30, in main
              d1 = fsIndex.load(path1)
            File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/fsIndex.py", line 138, in load
              v = unpickler.load()
          SystemError: <built-in method read of _io.BufferedReader object at 0x7fb631a86670> returned a result with an error set
              index_test.go:229: zodb/py read/compare index: exit status 1
      
      -> Fix it by explicitly emitting index entries in binary form.
      
      To be able to compare "index from py2 -> loaded into go -> saved by go ->
      compared by py3" implement a bit of backward compatibility for  loading py2
      index files on py3. Do this compatibility only for known-good py2 files, not
      index produced by go code which is loaded by py3 without any compatibility
      shims.
      ef615203
    • Kirill Smelkov's avatar
      go/zodb/fs1: Fix index load from py3 data · 82098c49
      Kirill Smelkov authored
      Python3 emits index entries using bytes, but index.go was expecting only
      bytestrings. As the result loading index produced by py3 was failing
      e.g. as
      
          index_test.go:201: index load: testdata/py3_pickle3/1.fs.index: pickle @6: invalid oidPrefix: type ogórek.Bytes
      
      -> Fix it by accepting both bytes and bytestrings inside the index.
      82098c49
    • Kirill Smelkov's avatar
      go/zodb/fs1: Accept both FS21 and FS30 magics · 8bae132e
      Kirill Smelkov authored
      FileStorage/py2 saves data with FS21 magic, while FileStorage/py3 with
      FS30 magic:
      
      https://github.com/zopefoundation/ZODB/blob/5.8.1/src/ZODB/_compat.py#L25-L77
      
      Up till now FileStorage/go was accepting only FS21 and so with
      py3-generated data it was leading to e.g. the following failure:
      
          === RUN   TestEmptyDB/py3_pickle3
              filestorage_test.go:90: testdata/py3_pickle3/empty.fs: invalid fs1 magic "FS30"
      
      -> Fix it by accepting both py2 and py3 FileStorage magics.
      
      We do accept them both without any other change because FileStorage
      format, even when it comes with those two different magic, is really the
      same.
      8bae132e
    • Kirill Smelkov's avatar
      go/zodb/fs1: Test FileStorage on all py2/py3 ZODB kinds of data we care about · ede4b3bb
      Kirill Smelkov authored
      Previously we were testing FileStorage/go only with data generated by
      python2 and pickle protocol=2. However even on py2 there are more pickle
      protocols that are in use, and also there is python3.
      
      -> Modernize py/gen-testdata to use run_with_all_zodb_pickle_kinds
         that was recently added as part of nexedi/zodbtools@f9d36ba7
         and generate test data with both python2 and python3. It is handy to
         use py2py3-venv(*) to prepare python environment to do that.
      
         Adjust tests on Go side to verify how FileStorage handles all generated zkinds.
      
      py2_pickle1, py2_pickle2 and py2_pickle3 are handled well.
      Tests for py3_pickle3 currently fail and so are marked with "xfail".
      
      We will fix tests for py3_pickle3 in follow-up patches.
      
      Old testdata are not yet removed because e.g. fs1tools and zodbdump
      tests depend on them. We will remove old fs1 testdata after adjusting
      tests in dependent packages step-by-step.
      
      (*) see nexedi/zodbtools@fac2f190
      ede4b3bb
    • Kirill Smelkov's avatar
      go/zodb: Fix PyData.ClassName for py3_pickle3 · a865708a
      Kirill Smelkov authored
      Test for PyData.ClassName currently fails for py3_pickle3 zkind:
      
          === RUN   TestPyClassName/py3_pickle3
              pydata_test.go:50: class name for "\x80\x03X\x18\x00\x00\x00ZODB.tests.testSerializeq\x00X\x13\x00\x00\x00ClassWithoutNewargsq\x01\x86q\x02N\x86q\x03.":
                  have: "?.?"
                  want: "ZODB.tests.testSerialize.ClassWithoutNewargs"
              pydata_test.go:50: class name for "\x80\x03X\x18\x00\x00\x00ZODB.tests.testSerializeq\x00X\x10\x00\x00\x00ClassWithNewargsq\x01\x86q\x02K\x01\x85q\x03\x86q\x04.":
                  have: "?.?"
                  want: "ZODB.tests.testSerialize.ClassWithNewargs"
      
      This happens because classname decoder was expecting class and module
      names to come as py2 bytestrings, while py3 emits them as unicode.
      
      -> Fix it by accepting both py2 bytestring and unicode for class and
      module names.
      a865708a
    • Kirill Smelkov's avatar
      go/zodb: Test PyData and Map/List on all py2/py3 kinds of data we care about · ed2b96ca
      Kirill Smelkov authored
      Previously we were testing PyData and Map/List only with data generated by python2
      and pickle protocol=2. However even on py2 there are more pickle
      protocols that are in use, and also there is python3.
      
      -> Modernize py/pydata-gen-testdata to use run_with_all_zodb_pickle_kinds
         that was recently added as part of nexedi/zodbtools@f9d36ba7
         and generate test data with both python2 and python3. It is handy to
         use py2py3-venv(*) to prepare python environment to do that.
      
         Adjust tests on Go side to verify how PyData and Map/List handle all generated
         zkinds.
      
      py2_pickle1, py2_pickle2 and py2_pickle3 are handled well.
      Tests for py3_pickle3 currently fail and so are marked with "xfail".
      
      We will fix tests for py3_pickle3 in the next patches.
      
      (*) see nexedi/zodbtools@fac2f190
      ed2b96ca
  2. 18 Sep, 2024 5 commits
    • Kirill Smelkov's avatar
      X go.mod: Adjust it for updated zodb/internal/weak · efde5253
      Kirill Smelkov authored
      - go 1.14 -> 1.18 as we now use generics
      - `go get go4.org/unsafe/assume-no-moving-gc` since weak package uses that
      - `go mod tidy` as go build/test now was requesting this
      efde5253
    • Kirill Smelkov's avatar
      Merge branch 'master' into t · bf9c82e7
      Kirill Smelkov authored
      This replaces previous attempts to fix zodb/internal/weak with patches
      from master.
      
      * master:
        go/zodb/internal/weak: Disable support for weak references
        go/zodb/internal/weak: Assert that the object was not moved in the finalizer
        go/zodb/internal/weak: Try to fix GC crashes via reworking Ref to keep only one word instead of two
      bf9c82e7
    • Kirill Smelkov's avatar
      go/zodb/internal/weak: Disable support for weak references · ee23551d
      Kirill Smelkov authored
      Experience shows that we cannot provide reliably working weak references
      support without crashing GC because there needs to be a coordination in
      between object resurrection and at least GC mark phase[1,2] with [3] and
      [4] showing an idea what kind it coordination it needs to be.
      
      And since support for weak references is upcoming in standard library
      with hopefully Go 1.24 [5], it does not make sense to continue pushing
      hard to make weak references work on our standalone side.
      
      By disabling support for custom weak references we will avoid crashes,
      but will start to leak unused objects in zodb.Connection live cache.
      This should be relatively ok for now as WCFS in wendelin.core is
      currently the only known ZODB/go user and it cares to drop ZBlk* state
      after loading the data. The leak, thus, should be modest with,
      hopefully, not creating problems in practice.
      
      And once std package weak is out there we will switch to that restoring
      automatic LiveCache cleanup.
      
      [1] https://github.com/golang/go/issues/41303
      [2] wendelin.core@9b44fc23
      [3] https://github.com/golang/go/commit/dfc86e922cd0
      [4] https://github.com/golang/go/commit/79fd633632cd
      [5] https://github.com/golang/go/issues/67552
      
      /reviewed-by @levin.zimmermann
      /reviewed-on !11
      ee23551d
    • Kirill Smelkov's avatar
      go/zodb/internal/weak: Assert that the object was not moved in the finalizer · c48f7008
      Kirill Smelkov authored
      This is good to do since we are relying on non-moving property of
      current GC. It never triggered but it removed one uncertainty while
      debugging GC crash issue. Good to have it in place.
      
      /reviewed-by @levin.zimmermann
      /reviewed-on !11
      c48f7008
    • Kirill Smelkov's avatar
      go/zodb/internal/weak: Try to fix GC crashes via reworking Ref to keep only one word instead of two · 55491547
      Kirill Smelkov authored
      We are observing garbage-collector crashes due to weak package under
      load(*) with GC crashing as e.g.
      
          runtime: full=0xc0001f10000005 next=205 jobs=204 nDataRoots=1 nBSSRoots=1 nSpanRoots=16 nStackRoots=184
          panic: non-empty mark queue after concurrent mark
          fatal error: panic on system stack
      
          runtime stack:
          runtime.throw({0x5c60fe?, 0x601d70?})
                  /home/kirr/src/tools/go/go1.21/src/runtime/panic.go:1077 +0x5c fp=0xc000051e88 sp=0xc000051e58 pc=0x436efc
          panic({0x585100?, 0x601d70?})
                  /home/kirr/src/tools/go/go1.21/src/runtime/panic.go:840 +0x6ea fp=0xc000051f38 sp=0xc000051e88 pc=0x436e0a
          runtime.gcMark(0x118946?)
                  /home/kirr/src/tools/go/go1.21/src/runtime/mgc.go:1464 +0x40c fp=0xc000051fb0 sp=0xc000051f38 pc=0x41bd6c
          runtime.gcMarkTermination.func1()
                  /home/kirr/src/tools/go/go1.21/src/runtime/mgc.go:962 +0x17 fp=0xc000051fc8 sp=0xc000051fb0 pc=0x41b277
          runtime.systemstack()
                  /home/kirr/src/tools/go/go1.21/src/runtime/asm_amd64.s:509 +0x4a fp=0xc000051fd8 sp=0xc000051fc8 pc=0x468eea
      
      One problem in current implementation is that weak.Ref keeps two words
      for the copy of original interface object and recreates that interface
      object on Ref.Get from those two words _nonatomically_. Which is
      explicitly documented by Go memory model as something that can lead to
      corrupted memory and crashes. From https://go.dev/ref/mem#restrictions:
      
          This means that races on multiword data structures can lead to
          inconsistent values not corresponding to a single write. When the values
          depend on the consistency of internal (pointer, length) or (pointer,
          type) pairs, as can be the case for interface values, maps, slices, and
          strings in most Go implementations, such races can in turn lead to
          arbitrary memory corruption.
      
      We can avoid doing multiword writes during object resurrection by using
      concrete type T* instead of interface{}.
      
      Unfortunately as wendelin.core@9b44fc23
      shows it does not help with the above issue and the GC continues to
      crash with the same "panic: non-empty mark queue after concurrent mark"
      message. This is because weak.Ref implementation needs tight integration
      with concurrent GC that Go does and in practice we are unable to do that
      from outside of Go runtime internals. See e.g.
      https://github.com/golang/go/commit/dfc86e922cd0 and
      https://github.com/golang/go/commit/79fd633632cd to get an idea what
      kind of integration that needs to be.
      
      Anyway before disabling weak references support I wanted to commit this
      change to show that one-word approach was tried and it does not work.
      This leaves a trace in the history.
      
      On the good side we are going to use standard package weak in the future
      hopefully (https://github.com/golang/go/issues/67552). That needs to
      wait for at least Go 1.24 though.
      
      (*) see https://github.com/golang/go/issues/41303 for details
      
      /reviewed-by @levin.zimmermann
      /reviewed-on !11
      55491547
  3. 20 Aug, 2024 12 commits
    • Kirill Smelkov's avatar
      Merge branch 'master' into t · e7fde147
      Kirill Smelkov authored
      * master:
        go/neo/proto: Update 'Compression' to int to support different compression algorithms
      e7fde147
    • Levin Zimmermann's avatar
      go/neo/proto: Update 'Compression' to int to support different compression algorithms · 8e00bfdc
      Levin Zimmermann authored
      With nexedi/neoppod@fd80cc30 NEO/py added support to encode the compression
      algorithm with the 'Compression' parameter. Before this, compression could
      only be true (= with compression) or false (= without compression). Now
      the absence of compression is encoded with 0. Any other number than 0
      encodes a compression algorithm. The mapping is currently:
      
      	1 = zlib
      
      In the future, 2 could mean zstd [1].
      
      [1] https://github.com/facebook/zstd/issues/1134
      
      /reviewed-by @kirr
      /reviewed-on !6
      8e00bfdc
    • Levin Zimmermann's avatar
      go/neo/proto: Update 'Compression' to int to support different compression algorithms · 12a6a923
      Levin Zimmermann authored
      With nexedi/neoppod@fd80cc30 NEO/py added support to encode the compression
      algorithm with the 'Compression' parameter. Before this, compression could
      only be true (= with compression) or false (= without compression). Now
      the absence of compression is encoded with 0. Any other number than 0
      encodes a compression algorithm. The mapping is currently:
      
      	1 = zlib
      
      In the future, 2 could mean zstd [1].
      
      [1] https://github.com/facebook/zstd/issues/1134
      
      /reviewed-by @kirr
      /reviewed-on !6
      12a6a923
    • Levin Zimmermann's avatar
      identification: Set possible answers to same msgid · 20b37b60
      Levin Zimmermann authored
      When a node tries to connect to another node it initially sends a
      'RequestIdentification' packet. The other node can either reply with
      'AcceptIdentification' or in case of a secondary master with
      'NotPrimaryMaster'.
      
      In the second case the message id differs from the initial requests
      message id. This makes it difficult in a multi-threaded implementation
      to proceed this answer: due to the different msg/connection - id the
      multi-threaded implementation tries to proceed this incoming message in
      a different thread, while the requesting thread waits forever for its
      peers reply.
      
      The most straightforward solution is to use the same connection - id for
      both possible answers to the 'RequestIdentification' packet. This
      doesn't break given NEO/py implementation and is only a small patch.
      A workaround in a multi-threaded implementation on the other hand
      seems to be much more complicated and time-consuming. Finally it also makes
      sense semantically, because "Message IDs are used to identify response
      packets" and in the given context the 'NotPrimaryMaster' *is* the de facto
      response of a 'RequestIdentification'.
      
      /suggested-at !2
      /proposed-for-review-at nexedi/neoppod!20
      20b37b60
    • Kirill Smelkov's avatar
      go/internal/xtesting: Prepare to run user's tests with both py2 and py3 · 15401594
      Kirill Smelkov authored
      Tests, that want to verify something wrt both py2 and py3 will do
      WithEachPy(tested_func) which will run tested_func two times:
      
      1) under environment where python on $PATH points to python2, and
      2) under environment where python on $PATH points to python3.
      
      Client tests will now need to use just "python" instead of e.g.
      "python2" or "python3" to run some python program, and each time
      "python" will correspond to current phase of WithEachPy execution.
      
      This will be soon handy to test things wrt e.g. ZEO/py2 and ZEO/py3
      servers and similar situations.
      
      Tests that merely want to use some python program just for their inner
      working, for example to run `zodb commit`, no longer indicate their
      preference for py2. For such tests it is a matter of preference in
      pre-setup environment to where "python" points.
      
      For the reference under environments created with py2py3-venv(*) default
      "python" currently points to "python2".
      
      (*) see nexedi/zodbtools@fac2f190
      15401594
    • Kirill Smelkov's avatar
      go/internal/xtesting += ZTestData & friends · 07e1263a
      Kirill Smelkov authored
      ZTestData is Go counterpart of ztestdata on zodbtools/py side from
      https://lab.nexedi.com/nexedi/zodbtools/-/blob/f9d36ba7/zodbtools/test/conftest.py#L31-65
      that was recently added in nexedi/zodbtools@bf772ce0.
      
      On Go side each ZTestData describes testdata files on the filesystem +
      carry arbitrary extra information. For example FileStorage tests will
      use it to place there "what is expected to load" information
      corresponding to on-filesystem database.
      07e1263a
    • Kirill Smelkov's avatar
      go/internal/xmaps: New package that complements standard package maps · 28489790
      Kirill Smelkov authored
      Currently with the only utility to retrieve map keys in sorted order.
      
      We will need this in several tests in follow-up patches.
      28489790
    • Kirill Smelkov's avatar
      go/zodbpickle: Factor/generalize pickletools.Xstrbytes8 into here as zodbpickle.Unpack64 · 8da1234e
      Kirill Smelkov authored
      Similarly to ZODB/py which provides u64 and p64 utilities. Add
      zodbpickle.Pack64 for symmetry.
      8da1234e
    • Kirill Smelkov's avatar
      go/*: Switch to StrictUnicode mode for pickling/unpickling · 078b6262
      Kirill Smelkov authored
      Because without StrictUnicode both bytestrings (str from py2) and
      unicode (unicode from py2 and str from py3) load into the same Go type
      string so it becomes impossible to distinguish them from each other.
      Re-saving data, thus, also generally introduces changes as e.g.
      string loaded via bytestring will be saved as unicode when pickling with
      protocol 3. Or a loaded unicode will be saved as bytestring with
      pickling via protocol=2.
      
      -> Switching to StrictUnicode mode solves all those problems.
      
      Please see updated documentatin for zodbpickle.go and
      https://github.com/kisielk/og-rek/commit/b28613c2 for more details about
      StrictUnicode mode.
      
      For ZODB/go this is change in behaviour exposed to outside. However
      there is currently only one known ZODB/go user - WCFS in Wendelin.core -
      and that user will be updated correspondingly as well.
      078b6262
    • Kirill Smelkov's avatar
      go/*: Move creation of a Pickler and Unpickler to central place · 02076be2
      Kirill Smelkov authored
      We already have 3 places where we create picklers and unpicklers:
      zodb/pydata.go, fs1/index.go and zeo/proto.go and they are already
      diverging a bit: for example pydata was explicitly specifying protocol
      for pickle encoder, while other places were using defaults.
      
      We will soon want to use StrictUnicode option for all picklers and
      unpicklers we create, which means it is better to keep this
      customization only in one place instead of copy-pasting it in between
      callsites increasing the possibility for divergence and errors.
      
      -> Move the code to create pickler and unpickler to internal zodbpickle
      package as a preparatory step for that.
      
      For the reference: this package is merely a wrapper around ogórek, not a
      fork of any code like zodbpickle/py is.
      02076be2
    • Kirill Smelkov's avatar
      go/go.mod: v↑ go 1.14 -> go 1.18 · 8bc46475
      Kirill Smelkov authored
      We will soon want to use generics and `any` which are available only for go >= 1.18 .
      
      The other changes in go.sum and go.mod are there because after just
      
          -go 1.14
          +go 1.18
      
      the build stopped to work and asked to do `go mod tidy`
      8bc46475
    • Kirill Smelkov's avatar
      go/*: Switch pickletools.Xint64 to ogórek.AsInt64 · e0063782
      Kirill Smelkov authored
      ogórek.AsInt64 was added in https://github.com/kisielk/og-rek/commit/010fbd2e
      with functionality similar to what we had in pickletools.Xint64 before.
      e0063782