Commit b9070f8f authored by Kirill Smelkov

Merge branch 'y/restore' into x/catobj

* y/restore: (56 commits)
  zodbrestore - Tool to restore content of a ZODB database from zodbdump output
  zodbcommit: Prepare to compute current serial of an oid lazily
  zodbcommit: Don't forget to call tpc_abort on an error
  Drop support for ZODB3
  tox: Don't run tests agains ZODB+PR183 anymore
  Add way to run tests via nxdtest
  tidrange: test: Fix for py3
  *: dict.keys() returns sequence, not [] on py3
  *: Pass bytes literal into BytesIO
  zodbdump: Use bytes to emit its output
  *: Zodbdump format is semi text-binary: Mark it as such + handle zdump output as binary
  *: Don't use %r to print/report lines/bytes to outside
  zodbinfo: Provide "head" as command to query DB head; Turn "last_tid" into deprecated alias for head
  test/gen_testdata: Fix for ZODB5 > 5.5.1 + preserve database compatibility with ZODB3/py2
  tox: Don't duplicate setup.py on which for-tests dependencies we need
  zodbdump: Default out to stdout in binary mode
  *: s.decode('hex') -> fromhex(s)
  utils: Initialize hashers with bytes
  *: Pass bytes - not unicode - literals to sha1()
  util: Fix ashex for Python3
  ...
parents 6bac5920 198b8df4
/dist
/zodbtools.egg-info
/.tox/
# setup to run tests on Nexedi testing infrastructure.
# https://stack.nexedi.com/test_status
TestCase('pytest', ['python', '-m', 'pytest'], summaryf=PyTest.summary)
Zodbtools change history
========================
0.0.0.dev8 (2019-03-07)
-----------------------
- Support using absolute and relative time in tidrange. One example usage is:
``zodb analyze data.fs 2018-01-01T10:30:00Z..yesterday`` (commit__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/4037002c
- Python3 support progressed (`commit 1`__, 2__, 3__), but zodbtools does not
support python3 yet. The test suite was extended to run on python3 (commit__)
and also on ZODB with raw extensions from the ongoing pull request `#183`__
(commit__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/d6bde57c
__ https://lab.nexedi.com/nexedi/zodbtools/commit/f16ccfd4
__ https://lab.nexedi.com/nexedi/zodbtools/commit/b338d004
__ https://lab.nexedi.com/nexedi/zodbtools/commit/eaa3aec7
__ https://github.com/zopefoundation/ZODB/pull/183
__ https://lab.nexedi.com/nexedi/zodbtools/commit/c50bfb00
0.0.0.dev7 (2019-01-11)
-----------------------
- Fix zodbtools to work with all ZODB3, ZODB4 and ZODB5 (`commit 1`__, 2__,
3__, 4__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/425e6656
__ https://lab.nexedi.com/nexedi/zodbtools/commit/0e5d2f81
__ https://lab.nexedi.com/nexedi/zodbtools/commit/7a94e312
__ https://lab.nexedi.com/nexedi/zodbtools/commit/8ff7020c
- Fix `zodb analyze` for the case when history range is empty (`commit 1`__,
2__, 3__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/b4824ad5
__ https://lab.nexedi.com/nexedi/zodbtools/commit/d37746c6
__ https://lab.nexedi.com/nexedi/zodbtools/commit/474a0559
- Zodbtools is not yet Python3-ready (commit__), but we started to fix it
step-by-step (`commit 1`__, 2__, 3__, 4__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/7c5bb0b5
__ https://lab.nexedi.com/nexedi/zodbtools/commit/7d24147b
__ https://lab.nexedi.com/nexedi/zodbtools/commit/55853615
__ https://lab.nexedi.com/nexedi/zodbtools/commit/79aa0c45
__ https://lab.nexedi.com/nexedi/zodbtools/commit/5e2ed5e7
0.0.0.dev6 (2018-12-30)
-----------------------
- `zodb analyze` can now work with any ZODB storage and supports analyzing a
particular range of history (`commit 1`__, 2__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/3ce22f28
__ https://lab.nexedi.com/nexedi/zodbtools/commit/7ad9e1df
- Add help for specifying TID ranges (commit__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/f7eff5fe
- Always close opened storages (commit__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/9dbe70f3
0.0.0.dev5 (2018-12-13)
-----------------------
- Start to stabilize `zodb dump` format. The format is close to being stable now
and will likely be changed, if at all, only in minor ways (`commit 1`__, 2__,
3__, 4__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/75c03368
__ https://lab.nexedi.com/nexedi/zodbtools/commit/33230940
__ https://lab.nexedi.com/nexedi/zodbtools/commit/7f0bbf7e
__ https://lab.nexedi.com/nexedi/zodbtools/commit/624aeb09
- Add `DumpReader` - class to read/parse input in `zodbdump` format (commit__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/dd959b28
- Add `zodb commit` subcommand to commit new transaction into ZODB (commit__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/960c5e17
0.0.0.dev4 (2017-04-05)
-----------------------
- Clarify licensing (`commit 1`__, 2__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/9e4305b8
__ https://lab.nexedi.com/nexedi/zodbtools/commit/79cf177a
- Add `zodb` tool to drive all subcommands (commit__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/984cfe22
- Add `zodb info` subcommand to print general information about a ZODB database
(commit__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/37b9fbde
- Switch to open ZODB storages by URL, not only via ZConfig files. URL support
comes from `zodburi` (`commit 1`__, 2__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/82b06413
__ https://lab.nexedi.com/nexedi/zodbtools/commit/bfeb1690
0.0.0.dev3 (2016-11-17)
-----------------------
- Move Nexedi version of `zodbanalyze` from ERP5 into zodbtools.
Compared to the original `zodbanalyze`, the Nexedi version is faster, prints not
only total but also current sizes, and supports running on bigger databases
where keeping the whole working set in RAM is not feasible. It also supports
analyzing a Repozo deltafs file directly.
(`commit 1`__, 2__, 3__, 4__, 5__, 6__, 7__, 8__, 9__)
__ https://lab.nexedi.com/nexedi/zodbtools/commit/ab17cf2d
__ https://lab.nexedi.com/nexedi/zodbtools/commit/1e506a81
__ https://lab.nexedi.com/nexedi/zodbtools/commit/d86d04dc
__ https://lab.nexedi.com/nexedi/zodbtools/commit/5fd2c0eb
__ https://lab.nexedi.com/nexedi/zodbtools/commit/a9346784
__ https://lab.nexedi.com/nexedi/zodbtools/commit/1a489502
__ https://lab.nexedi.com/nexedi/zodbtools/commit/8dc37247
__ https://lab.nexedi.com/nexedi/zodbtools/commit/e4d4762a
__ https://lab.nexedi.com/nexedi/zodbtools/commit/2e834aaf
0.0.0.dev2 (2016-11-17)
-----------------------
- Add initial draft of `zodbdump` - tool to dump content of a ZODB database
(`commit 1`__, 2__).
__ https://lab.nexedi.com/nexedi/zodbtools/commit/c0a6299f
__ https://lab.nexedi.com/nexedi/zodbtools/commit/d955f79a
0.0.0.dev1 (2016-11-16)
-----------------------
- Initial release of zodbtools with `zodbcmp` (`commit 1`__, 2__, 3__).
We originally tried to put `zodbcmp` into ZODB itself, but Jim Fulton asked__
not to load ZODB with scripts anymore. This way zodbtools was created.
__ https://lab.nexedi.com/nexedi/zodbtools/commit/fd6ad1b9
__ https://lab.nexedi.com/nexedi/zodbtools/commit/66a03ae5
__ https://lab.nexedi.com/nexedi/zodbtools/commit/66946b8d
__ https://github.com/zopefoundation/ZODB/pull/128#issuecomment-260970932
include COPYING LICENSE-ZPL.txt README.rst CHANGELOG.rst tox.ini
recursive-include zodbtools/test/testdata *.fs *.index *.ok *.txt
......@@ -8,7 +8,9 @@ scripts anymore. So we are here:
__ https://github.com/zopefoundation/ZODB/pull/128#issuecomment-260970932
- `zodb analyze` - analyze FileStorage or repozo deltafs usage.
- `zodb analyze` - analyze ZODB database or repozo deltafs usage.
- `zodb cmp` - compare content of two ZODB databases bit-to-bit.
- `zodb commit` - commit new transaction into a ZODB database.
- `zodb dump` - dump content of a ZODB database.
- `zodb restore` - restore content of a ZODB database.
- `zodb info` - print general information about a ZODB database.
......@@ -8,9 +8,10 @@ def readfile(path):
setup(
name = 'zodbtools',
version = '0.0.0.dev4',
version = '0.0.0.dev8',
description = 'ZODB-related utilities',
long_description = readfile('README.rst'),
long_description = '%s\n----\n\n%s' % (
readfile('README.rst'), readfile('CHANGELOG.rst')),
url = 'https://lab.nexedi.com/nexedi/zodbtools',
license = 'GPLv3+ with wide exception for Open-Source; ZPL 2.1',
author = 'Nexedi + Zope Foundation + Community',
......@@ -19,23 +20,21 @@ setup(
keywords = 'zodb utility tool',
packages = find_packages(),
install_requires = ['ZODB', 'zodburi', 'pygolang >= 0.0.0.dev3', 'six'],
install_requires = ['ZODB', 'zodburi', 'zope.interface', 'pygolang >= 0.0.0.dev6', 'six', 'dateparser'],
extras_require = {
'test': ['pytest'],
'test': ['pytest', 'freezegun', 'pytz', 'mock;python_version<="2.7"'],
},
entry_points= {'console_scripts': ['zodb = zodbtools.zodb:main']},
# FIXME restore py3 support
classifiers = [_.strip() for _ in """\
Development Status :: 3 - Alpha
Intended Audience :: Developers
Operating System :: POSIX :: Linux
Programming Language :: Python :: 2
Programming Language :: Python :: 2.7
Programming Language :: Python :: 3
Programming Language :: Python :: 3.4
Programming Language :: Python :: 3.5
Topic :: Database
Topic :: Utilities
Framework :: ZODB\
......
# zodbtools | tox setup
[tox]
envlist = py{27,36,37}-ZODB{4,5}
[testenv]
deps =
    .[test]
    # latest current ZODB 4
    ZODB4: ZODB >=4.0, <5.0dev
    ZODB4: ZEO >=4.0, <5.0dev
    # ZEO4 depends on transaction <2
    ZODB4: transaction <2.0dev
    # latest current ZODB 5
    ZODB5: ZODB >=5.6, <6.0dev
    ZODB5: ZEO >=5.0, <6.0dev
commands= {envpython} -m pytest
# -*- coding: utf-8 -*-
# zodbtools - help topics
# Copyright (C) 2017 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# Copyright (C) 2017-2018 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
......@@ -51,4 +52,39 @@ Please see zodburi documentation for full details:
http://docs.pylonsproject.org/projects/zodburi/
"""
topic_dict['zurl'] = "specifying database URL", help_zurl
help_tidrange = """\
Many zodb commands can be invoked on specific range of database history and
accept <tidrange> parameter for that. The syntax for <tidrange> is
tidmin..tidmax
where tidmin and tidmax specify [tidmin, tidmax] range of transactions, ends
inclusive. Both tidmin and tidmax are optional and default to
tidmin: 0 (start of database history)
tidmax: +∞ (end of database history)
If a tid (tidmin or tidmax) is given, it has to be specified as follows:
- a 16-digit hex number specifying transaction ID, e.g. 0285cbac258bf266
- absolute timestamp, in RFC3339 or RFC822 formats
- relative timestamp, e.g. yesterday, 1 week ago
Example tid ranges:
.. whole database history
000000000000aaaa.. transactions starting from 000000000000aaaa till latest
..000000000000bbbb transactions starting from database beginning till 000000000000bbbb
000000000000aaaa..000000000000bbbb transactions starting from 000000000000aaaa till 000000000000bbbb
1985-04-12T23:20:50.52Z..2018-01-01T10:30:00Z
transactions starting from 1985-04-12 at 23 hours
20 minutes 50 seconds and 520000000 nano seconds
in UTC till 2018-01-01 at 10 hours 30 minutes in UTC
1_week_ago..yesterday transactions from one week ago until yesterday.
In commands <tidrange> is optional - if it is not given at all, it defaults to
0..+∞, i.e. to whole database history.
"""
topic_dict['zurl'] = "specifying database URL", help_zurl
topic_dict['tidrange'] = "specifying history range", help_tidrange
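The `tidmin..tidmax` syntax described in the help text above can be sketched as follows. This is a minimal illustration that handles only the 16-digit hex form; the names `parse_hex_tid` and `parse_hex_tidrange` are hypothetical and are not the zodbtools implementation, which also accepts absolute and relative timestamps.

```python
import re

# A tid in hex form is exactly 16 hex digits (8 bytes).
_HEX_TID = re.compile(r"[0-9a-fA-F]{16}")


def parse_hex_tid(text):
    """Convert a 16-digit hex tid into 8 raw bytes (hypothetical helper)."""
    if not _HEX_TID.fullmatch(text):
        raise ValueError("invalid tid: %r" % (text,))
    return bytes.fromhex(text)


def parse_hex_tidrange(text):
    """Split 'tidmin..tidmax' into (tidmin, tidmax).

    A missing end is returned as None, meaning "start of history" for
    tidmin and "end of history" for tidmax.
    """
    parts = text.split("..")
    if len(parts) != 2:
        raise ValueError("invalid tidrange: %r" % (text,))
    tidmin, tidmax = (parse_hex_tid(p) if p else None for p in parts)
    return tidmin, tidmax
```

Returning `None` for an omitted end keeps "no bound" distinct from an explicit tid of zero, which mirrors how the defaults are described in the help text.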
# Copyright (C) 2019 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
import pytest
from zodbtools.test.testutil import zext_supported
# zext is a test fixture function object that allows to exercise 2 cases:
#
# - when ZODB does not have txn.extension_bytes support
# - when ZODB might have txn.extension_bytes support
#
# in a test, zext should be used as follows:
#
#   def test_something(zext):
#       # bytes for an extension dict
#       raw_ext = dumps({...})
#
#       # will be either same as raw_ext, or b'' if ZODB lacks txn.extension_bytes support
#       raw_ext = zext(raw_ext)
#
#       # zext.disabled indicates whether testing for non-empty extension was disabled.
#       if zext.disabled:
#           ...
@pytest.fixture(params=['!zext', 'zext'])
def zext(request):
    if request.param == '!zext':
        # txn.extension_bytes is not working - always test with empty extension
        def _(ext):
            return b''
        _.disabled = True
        return _
    else:
        # txn.extension_bytes might be working - test with given extension and
        # xfail if ZODB does not have necessary support.
        def _(ext):
            return ext
        _.disabled = False
        if not zext_supported():
            request.applymarker(pytest.mark.xfail(reason='ZODB does not have txn.extension_bytes support'))
        return _
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (C) 2017 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# Copyright (C) 2017-2021 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
......@@ -39,6 +39,7 @@
from ZODB.FileStorage import FileStorage
from ZODB import DB
from ZODB.Connection import TransactionMetaData
from ZODB.POSException import UndoError
from persistent import Persistent
import transaction
......@@ -60,7 +61,12 @@ def hex64(packed):
return '0x%016x' % unpack64(packed)
# make time.time() predictable
_xtime = time.mktime(time.strptime("04 Jan 1979", "%d %b %Y"))
_xtime0 = time.mktime(time.strptime("04 Jan 1979", "%d %b %Y"))
def xtime_reset():
global _xtime
_xtime = _xtime0
xtime_reset()
def xtime():
global _xtime
_xtime += 1.1
......@@ -94,7 +100,7 @@ class Object(Persistent):
# prepare extension dictionary for subject
alnum = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
def ext(subj):
def ext4subj(subj):
d = {"x-generator": "zodb/py%s (%s)" % (sys.version_info.major, subj)}
# also add some random 'x-cookie'
......@@ -107,7 +113,7 @@ def ext(subj):
d[xcookie] = cookie
# shuffle extension dict randomly - to likely trigger different ordering on save
keyv = d.keys()
keyv = list(d.keys())
random.shuffle(keyv)
ext = {}
for key in keyv:
......@@ -115,8 +121,66 @@ def ext(subj):
return ext
# gen_testdb generates test FileStorage database @ outfs_path
def gen_testdb(outfs_path):
# run_with_zodb4py2_compat(f) runs f preserving database compatibility with
# ZODB4/py2, which generates pickles encoded with protocol < 3.
#
# ZODB5 started to use protocol 3 and binary for oids starting from ZODB 5.4.0:
# https://github.com/zopefoundation/ZODB/commit/12ee41c4
# Undo it, while we generate test database.
def run_with_zodb4py2_compat(f):
import ZODB.ConflictResolution
import ZODB.Connection
import ZODB.ExportImport
import ZODB.FileStorage.FileStorage
import ZODB._compat
import ZODB.broken
import ZODB.fsIndex
import ZODB.serialize
binary = getattr(ZODB.serialize, 'binary', None)
_protocol = getattr(ZODB.serialize, '_protocol', None)
Pz4 = 2
try:
ZODB.serialize.binary = bytes
# XXX cannot change just ZODB._compat._protocol, because many modules
# do `from ZODB._compat import _protocol` and just `import ZODB`
# imports many ZODB.X modules. In other words we cannot change
# _protocol just in one place.
ZODB.ConflictResolution._protocol = Pz4
ZODB.Connection._protocol = Pz4
ZODB.ExportImport._protocol = Pz4
ZODB.FileStorage.FileStorage._protocol = Pz4
ZODB._compat._protocol = Pz4
ZODB.broken._protocol = Pz4
ZODB.fsIndex._protocol = Pz4
ZODB.serialize._protocol = Pz4
f()
finally:
ZODB.serialize.binary = binary
ZODB.ConflictResolution._protocol = _protocol
ZODB.Connection._protocol = _protocol
ZODB.ExportImport._protocol = _protocol
ZODB.FileStorage.FileStorage._protocol = _protocol
ZODB._compat._protocol = _protocol
ZODB.broken._protocol = _protocol
ZODB.fsIndex._protocol = _protocol
ZODB.serialize._protocol = _protocol
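The patch-then-restore pattern used by `run_with_zodb4py2_compat` can be illustrated in isolation. The sketch below is an assumption-laden simplification: `run_with_patched` is a hypothetical helper, and plain `SimpleNamespace` objects stand in for the ZODB modules. Unlike the code above, it removes an attribute that did not exist before the call instead of leaving it set to `None`.

```python
import types

_MISSING = object()  # sentinel: attribute was absent before patching


def run_with_patched(targets, name, value, f):
    """Set attribute `name` to `value` on every target, call f(), then restore.

    The original values are restored even if f() raises.
    """
    saved = [(t, getattr(t, name, _MISSING)) for t in targets]
    try:
        for t in targets:
            setattr(t, name, value)
        return f()
    finally:
        for t, old in saved:
            if old is _MISSING:
                if hasattr(t, name):
                    delattr(t, name)
            else:
                setattr(t, name, old)


# Usage: two stand-in "modules" that each hold their own copy of _protocol,
# as happens after `from ZODB._compat import _protocol`.
mod_a = types.SimpleNamespace(_protocol=3)
mod_b = types.SimpleNamespace(_protocol=3)
seen = run_with_patched([mod_a, mod_b], "_protocol", 2,
                        lambda: (mod_a._protocol, mod_b._protocol))
```

Patching every holder of the name is needed for the reason given in the XXX comment above: each importing module keeps its own binding, so changing one place does not affect the others.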
# gen_testdb generates test FileStorage database @ outfs_path.
#
# zext indicates whether or not to include non-empty extension into transactions.
def gen_testdb(outfs_path, zext=True):
def _():
_gen_testdb(outfs_path, zext)
run_with_zodb4py2_compat(_)
def _gen_testdb(outfs_path, zext):
xtime_reset()
ext = ext4subj
if not zext:
def ext(subj): return {}
logging.basicConfig()
# generate random changes to objects hooked to top-level root by a/b/c/... key
......@@ -163,7 +227,7 @@ def gen_testdb(outfs_path):
break
# delete an object
name = random.choice(root.keys())
name = random.choice(list(root.keys()))
obj = root[name]
root[name] = Object("%s%i*" % (name, i))
# NOTE user/ext are kept empty on purpose - to also test this case
......@@ -178,14 +242,16 @@ def gen_testdb(outfs_path):
''.join(chr(_) for _ in range(32)), # <- NOTE all control characters
u"delete %i\nalpha beta gamma'delta\"lambda\n\nqqq ..." % i,
ext("delete %s" % unpack64(obj._p_oid)))
stor.tpc_begin(txn)
stor.deleteObject(obj._p_oid, obj_tid_lastchange, txn)
stor.tpc_vote(txn)
# at low level stor requires ZODB.IStorageTransactionMetaData not txn (ITransaction)
txn_stormeta = TransactionMetaData(txn.user, txn.description, txn.extension)
stor.tpc_begin(txn_stormeta)
stor.deleteObject(obj._p_oid, obj_tid_lastchange, txn_stormeta)
stor.tpc_vote(txn_stormeta)
# TODO different txn status vvv
# XXX vvv it does the thing, but py fs iterator treats this txn as EOF
#if i != Niter-1:
# stor.tpc_finish(txn)
stor.tpc_finish(txn)
# stor.tpc_finish(txn_stormeta)
stor.tpc_finish(txn_stormeta)
# close db & rest not to get conflict errors after we touched stor
# directly a bit. everything will be reopened on next iteration.
......@@ -196,13 +262,22 @@ def gen_testdb(outfs_path):
# ----------------------------------------
from zodbtools.zodbdump import zodbdump
from zodbtools.test.testutil import zext_supported
def main():
# check that ZODB supports txn.extension_bytes; refuse to work if not.
if not zext_supported():
raise RuntimeError("gen_testdata must be used with ZODB that supports txn.extension_bytes")
out = "testdata/1"
gen_testdb("%s.fs" % out)
stor = FileStorage("%s.fs" % out, read_only=True)
with open("%s.zdump.ok" % out, "w") as f:
zodbdump(stor, None, None, out=f)
for zext in [True, False]:
dbname = out
if not zext:
dbname += "_!zext"
gen_testdb("%s.fs" % dbname, zext=zext)
stor = FileStorage("%s.fs" % dbname, read_only=True)
with open("%s.zdump.ok" % dbname, "wb") as f:
zodbdump(stor, None, None, out=f)
if __name__ == '__main__':
main()
# -*- coding: utf-8 -*-
# Copyright (C) 2019 Nexedi SA and Contributors.
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
from zodbtools.zodbanalyze import analyze, report
import os.path
def test_zodbanalyze(capsys):
for use_dbm in (False, True):
report(
analyze(
os.path.join(os.path.dirname(__file__), "testdata", "1.fs"),
use_dbm=use_dbm,
delta_fs=False,
tidmin=None,
tidmax=None,
),
csv=False,
)
captured = capsys.readouterr()
assert "Processed 68 records in 59 transactions" in captured.out
assert captured.err == ""
# csv output
report(
analyze(
os.path.join(os.path.dirname(__file__), "testdata", "1.fs"),
use_dbm=False,
delta_fs=False,
tidmin=None,
tidmax=None,
),
csv=True,
)
captured = capsys.readouterr()
assert (
"""Class Name,T.Count,T.Bytes,Pct,AvgSize,C.Count,C.Bytes,O.Count,O.Bytes
persistent.mapping.PersistentMapping,10,1578,45.633314%,157.800000,1,213,9,1365
__main__.Object,56,1880,54.366686%,33.571429,9,303,47,1577
"""
== captured.out
)
assert captured.err == ""
# empty range
report(
analyze(
os.path.join(os.path.dirname(__file__), "testdata", "1.fs"),
use_dbm=False,
delta_fs=False,
tidmin="ffffffffffffffff",
tidmax=None,
),
csv=False,
)
captured = capsys.readouterr()
assert "# ø\nNo transactions processed\n" == captured.out.encode('utf-8')
assert captured.err == ""
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2020 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# Jérome Perrin <jerome@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
from zodbtools.zodbcommit import zodbcommit
from zodbtools.zodbdump import zodbdump, Transaction, ObjectData, ObjectDelete, ObjectCopy
from zodbtools.util import storageFromURL, sha1
from ZODB.utils import p64, u64, z64
from ZODB._compat import BytesIO, dumps, _protocol # XXX can't yet commit with arbitrary ext.bytes
from tempfile import mkdtemp
from shutil import rmtree
from golang import func, defer
# verify zodbcommit.
@func
def test_zodbcommit(zext):
    tmpd = mkdtemp('', 'zodbcommit.')
    defer(lambda: rmtree(tmpd))
    stor = storageFromURL('%s/2.fs' % tmpd)
    defer(stor.close)
    head = stor.lastTransaction()
    # commit some transactions via zodbcommit and verify if storage dump gives
    # what is expected.
    t1 = Transaction(z64, ' ', b'user name', b'description ...', zext(dumps({'a': 'b'}, _protocol)), [
        ObjectData(p64(1), b'data1', 'sha1', sha1(b'data1')),
        ObjectData(p64(2), b'data2', 'sha1', sha1(b'data2'))])
    t1.tid = zodbcommit(stor, head, t1)
    t2 = Transaction(z64, ' ', b'user2', b'desc2', b'', [
        ObjectDelete(p64(2))])
    t2.tid = zodbcommit(stor, t1.tid, t2)
    buf = BytesIO()
    zodbdump(stor, p64(u64(head)+1), None, out=buf)
    dumped = buf.getvalue()
    assert dumped == b''.join([_.zdump() for _ in (t1, t2)])
    # ObjectCopy. XXX zodbcommit handled ObjectCopy by actually copying data,
    # not referencing previous transaction via backpointer.
    t3 = Transaction(z64, ' ', b'user3', b'desc3', b'', [
        ObjectCopy(p64(1), t1.tid)])
    t3.tid = zodbcommit(stor, t2.tid, t3)
    data1_1, _, _ = stor.loadBefore(p64(1), p64(u64(t1.tid)+1))
    data1_3, _, _ = stor.loadBefore(p64(1), p64(u64(t3.tid)+1))
    assert data1_1 == data1_3
    assert data1_1 == b'data1' # just in case
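The test above repeatedly computes `p64(u64(tid)+1)` to obtain "the tid just after `tid`", so that `loadBefore` returns the revision committed at `tid` itself. A minimal standalone sketch of that arithmetic, assuming tids are 8-byte big-endian unsigned integers; the names `pack64`, `unpack64` and `next_tid` are hypothetical stand-ins, not the `ZODB.utils` functions:

```python
import struct


def pack64(value):
    """Pack an integer into 8 big-endian bytes."""
    return struct.pack(">Q", value)


def unpack64(packed):
    """Unpack 8 big-endian bytes into an integer."""
    return struct.unpack(">Q", packed)[0]


def next_tid(tid):
    """Return the smallest tid strictly greater than `tid`."""
    return pack64(unpack64(tid) + 1)
```

Because the encoding is big-endian and fixed-width, comparing packed tids as bytes gives the same ordering as comparing the integers.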
# Copyright (C) 2017 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# -*- coding: utf-8 -*-
# Copyright (C) 2017-2020 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# Jérome Perrin <jerome@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
......@@ -17,21 +19,129 @@
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
from zodbtools.zodbdump import zodbdump
from zodbtools.zodbdump import (
zodbdump, DumpReader, Transaction, ObjectDelete, ObjectCopy,
ObjectData, HashOnly
)
from zodbtools.util import fromhex
from ZODB.FileStorage import FileStorage
from cStringIO import StringIO
from ZODB.utils import p64
from io import BytesIO
from os.path import dirname
from zodbtools.test.testutil import zext_supported
from pytest import raises, xfail
# verify zodbdump output against golden
def test_zodbdump():
tdir = dirname(__file__)
stor = FileStorage('%s/testdata/1.fs' % tdir, read_only=True)
def test_zodbdump(zext):
tdir = dirname(__file__)
zkind = '_!zext' if zext.disabled else ''
stor = FileStorage('%s/testdata/1%s.fs' % (tdir, zkind), read_only=True)
with open('%s/testdata/1.zdump.ok' % tdir) as f:
with open('%s/testdata/1%s.zdump.ok' % (tdir, zkind), 'rb') as f:
dumpok = f.read()
out = StringIO()
out = BytesIO()
zodbdump(stor, None, None, out=out)
assert out.getvalue() == dumpok
# verify zodbdump.DumpReader
def test_dumpreader():
in_ = b"""\
txn 0123456789abcdef " "
user "my name"
description "o la-la..."
extension "zzz123 def"
obj 0000000000000001 delete
obj 0000000000000002 from 0123456789abcdee
obj 0000000000000003 54 adler32:01234567 -
obj 0000000000000004 4 sha1:9865d483bc5a94f2e30056fc256ed3066af54d04
ZZZZ
obj 0000000000000005 9 crc32:52fdeac5
ABC
DEF!
txn 0123456789abcdf0 " "
user "author2"
description "zzz"
extension "qqq"
"""
r = DumpReader(BytesIO(in_))
t1 = r.readtxn()
assert isinstance(t1, Transaction)
assert t1.tid == fromhex('0123456789abcdef')
assert t1.user == b'my name'
assert t1.description == b'o la-la...'
assert t1.extension_bytes == b'zzz123 def'
assert len(t1.objv) == 5
_ = t1.objv[0]
assert isinstance(_, ObjectDelete)
assert _.oid == p64(1)
_ = t1.objv[1]
assert isinstance(_, ObjectCopy)
assert _.oid == p64(2)
assert _.copy_from == fromhex('0123456789abcdee')
_ = t1.objv[2]
assert isinstance(_, ObjectData)
assert _.oid == p64(3)
assert _.data == HashOnly(54)
assert _.hashfunc == 'adler32'
assert _.hash_ == fromhex('01234567')
_ = t1.objv[3]
assert isinstance(_, ObjectData)
assert _.oid == p64(4)
assert _.data == b'ZZZZ'
assert _.hashfunc == 'sha1'
assert _.hash_ == fromhex('9865d483bc5a94f2e30056fc256ed3066af54d04')
_ = t1.objv[4]
assert isinstance(_, ObjectData)
assert _.oid == p64(5)
assert _.data == b'ABC\n\nDEF!'
assert _.hashfunc == 'crc32'
assert _.hash_ == fromhex('52fdeac5')
t2 = r.readtxn()
assert isinstance(t2, Transaction)
assert t2.tid == fromhex('0123456789abcdf0')
assert t2.user == b'author2'
assert t2.description == b'zzz'
assert t2.extension_bytes == b'qqq'
assert t2.objv == []
assert r.readtxn() == None
z = b''.join([_.zdump() for _ in (t1, t2)])
assert z == in_
# unknown hash function
r = DumpReader(BytesIO(b"""\
txn 0000000000000000 " "
user ""
description ""
extension ""
obj 0000000000000001 1 xyz:0123 -
"""))
with raises(RuntimeError) as exc:
r.readtxn()
assert exc.value.args == ("""+5: invalid line: unknown hash function "xyz" ("obj 0000000000000001 1 xyz:0123 -")""",)
# data integrity error
r = DumpReader(BytesIO(b"""\
txn 0000000000000000 " "
user ""
description ""
extension ""
obj 0000000000000001 5 crc32:01234567
hello
"""))
with raises(RuntimeError) as exc:
r.readtxn()
assert exc.value.args == ("""+6: data corrupt: crc32 = 3610a686, expected 01234567""",)
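The data integrity case above expects `crc32 = 3610a686` for the payload `hello`. The check behind such a message can be sketched with the standard library alone; `crc32_hex` and `check_crc32` are hypothetical names used for illustration and are not part of `zodbtools.zodbdump`:

```python
import zlib


def crc32_hex(data):
    """Return the CRC-32 of `data` as 8 lowercase hex digits."""
    return "%08x" % (zlib.crc32(data) & 0xffffffff)


def check_crc32(data, expected_hex):
    """Raise RuntimeError if the CRC-32 of `data` differs from `expected_hex`."""
    actual = crc32_hex(data)
    if actual != expected_hex:
        raise RuntimeError("data corrupt: crc32 = %s, expected %s"
                           % (actual, expected_hex))
```

The `& 0xffffffff` mask keeps the result unsigned on Python 2, where `zlib.crc32` may return a negative value.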
# -*- coding: utf-8 -*-
# Copyright (C) 2021 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
from __future__ import print_function
from zodbtools.zodbrestore import zodbrestore
from zodbtools.util import storageFromURL
from os.path import dirname
from tempfile import mkdtemp
from shutil import rmtree
from golang import func, defer
# verify zodbrestore.
@func
def test_zodbrestore():
    tmpd = mkdtemp('', 'zodbrestore.')
    defer(lambda: rmtree(tmpd))
    # restore from testdata/1.zdump.ok and verify it gives result that is
    # bit-to-bit identical to testdata/1.fs
    tdata = dirname(__file__) + "/testdata"
    @func
    def _():
        zdump = open("%s/1.zdump.ok" % tdata, 'rb')
        defer(zdump.close)
        stor = storageFromURL('%s/2.fs' % tmpd)
        defer(stor.close)
        zodbrestore(stor, zdump)
    _()
    zfs1 = _readfile("%s/1.fs" % tdata)
    zfs2 = _readfile("%s/2.fs" % tmpd)
    assert zfs1 == zfs2
# _readfile reads file at path.
def _readfile(path): # -> data(bytes)
    with open(path, 'rb') as _:
        return _.read()
# -*- coding: utf-8 -*-
# Copyright (C) 2019-2020 Nexedi SA and Contributors.
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
import datetime
import os
import time
import pytest
import pytz
from freezegun import freeze_time
import tzlocal
from zodbtools.util import TidRangeInvalid, TidInvalid, ashex, parse_tid, parse_tidrange
from golang import b
@pytest.fixture
def fake_time():
    """Pytest's fixture to run this test as if now() was 2009-08-30T19:20:00Z
    and if the machine timezone was Europe/Paris
    """
    initial_tz = os.environ.get("TZ")
    os.environ["TZ"] = "Europe/Paris"
    time.tzset()
    tzlocal.reload_localzone()
    reference_time = datetime.datetime(2009, 8, 30, 19, 20, 0, 0,
                                       pytz.utc).astimezone(
                                           pytz.timezone("Europe/Paris"))
    with freeze_time(reference_time):
        yield
    del os.environ["TZ"]
    if initial_tz:
        os.environ["TZ"] = initial_tz
    time.tzset()
def test_tidrange_tid():
assert (
b"\x00\x00\x00\x00\x00\x00\xaa\xaa",
b"\x00\x00\x00\x00\x00\x00\xbb\xbb",
) == parse_tidrange("000000000000aaaa..000000000000bbbb")
assert (b"\x00\x00\x00\x00\x00\x00\xaa\xaa",
None) == parse_tidrange("000000000000aaaa..")
assert (None, b"\x00\x00\x00\x00\x00\x00\xbb\xbb"
) == parse_tidrange("..000000000000bbbb")
assert (None, None) == parse_tidrange("..")
with pytest.raises(TidRangeInvalid) as exc:
parse_tidrange("inv.alid")
assert exc.value.args == ("inv.alid", )
# range is correct, but a TID is invalid
with pytest.raises(TidInvalid) as exc:
parse_tidrange("invalid..")
assert exc.value.args == ("invalid", )
def test_tidrange_date():
assert (
b"\x03\xc4\x85v\x00\x00\x00\x00",
b"\x03\xc4\x88\xa0\x00\x00\x00\x00",
) == parse_tidrange(
"2018-01-01T10:30:00Z..2018-01-02T00:00:00.000000+00:00")
def test_parse_tid():
assert b"\x00\x00\x00\x00\x00\x00\xbb\xbb" == parse_tid("000000000000bbbb")
with pytest.raises(TidInvalid) as exc:
parse_tid("invalid")
assert exc.value.args == ("invalid", )
with pytest.raises(TidInvalid) as exc:
parse_tid('')
assert exc.value.args == ('', )
test_parameters = [] # of (reference_time, reference_tid, input_time)
with open(
os.path.join(
os.path.dirname(__file__), "testdata",
"tid-time-format.txt")) as f:
for line in f:
line = line.strip()
if line and not line.startswith("#"):
test_parameters.append(line.split(" ", 2))
@pytest.mark.parametrize("reference_time,reference_tid,input_time",
test_parameters)
def test_parse_tid_time_format(fake_time, reference_time, reference_tid,
input_time):
assert b(reference_tid) == ashex(parse_tid(input_time))
# check that the reference_tid matches the reference time, mainly
# to check that input is defined correctly.
assert b(reference_tid) == ashex(parse_tid(reference_time))
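To make the loading of `tid-time-format.txt` above easier to follow, here is a minimal sketch of the same parsing applied to a few in-memory lines. `parse_testdata` and `sample` are illustrative names only. The key point is `split(" ", 2)`: the third field keeps its inner spaces.

```python
def parse_testdata(lines):
    """Return a list of (reference_time, reference_tid, input_time) tuples."""
    params = []
    for line in lines:
        line = line.strip()
        # skip blank lines and comments
        if line and not line.startswith("#"):
            # maxsplit=2 -> everything after the second space is input_time
            params.append(tuple(line.split(" ", 2)))
    return params


sample = [
    "# RFC3339",
    "2018-01-01T10:30:00Z 03c4857600000000 2018-01-01T10:30:00Z",
    "",
    "2009-08-30T19:19:55Z 03805ec7eaaaaaaa 5 seconds ago",
]
params = parse_testdata(sample)
```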
# -*- coding: utf-8 -*-
# Copyright (C) 2019 Nexedi SA and Contributors.
# Jérome Perrin <jerome@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
import sys
try:
from unittest import mock
except ImportError:
# BBB python2
import mock
import pytest
from zodbtools import zodb
from zodbtools import help as help_module
# zodbrun runs zodb.main with argv and returns exit code + captured stdout/stderr.
def zodbrun(capsys, *argv):
with mock.patch.object(sys, 'argv', ('zodb',) + argv), \
pytest.raises(SystemExit) as excinfo:
zodb.main()
assert len(excinfo.value.args) == 1
ecode = excinfo.value.args[0]
return ecode, capsys.readouterr()
def test_main(capsys):
e, _ = zodbrun(capsys)
assert e == 2
assert "" == _.out
assert "Zodb is a tool for managing ZODB databases." in _.err
e, _ = zodbrun(capsys, '-h')
assert e == 0
assert "Zodb is a tool for managing ZODB databases." in _.out
assert "" == _.err
@pytest.mark.parametrize(
"help_topic",
tuple(zodb.command_dict) + tuple(help_module.topic_dict))
def test_help(capsys, help_topic):
e, _ = zodbrun(capsys, 'help', help_topic)
assert e == 0
assert _.err == ""
assert _.out != ""
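The pattern used by `zodbrun` above (patch `sys.argv`, call the entry point, catch `SystemExit`) can be tried without pytest. The following is a sketch only: `run_cli` and `demo_main` are hypothetical stand-ins, not zodbtools functions, and output capturing is left out.

```python
import sys
from unittest import mock


def run_cli(main, *argv):
    """Run main() with sys.argv patched; return the exit code (None if no exit)."""
    with mock.patch.object(sys, "argv", ("prog",) + argv):
        try:
            main()
        except SystemExit as e:
            return e.code
    return None


def demo_main():
    # hypothetical CLI: -h succeeds, anything else is a usage error
    sys.exit(0 if "-h" in sys.argv[1:] else 2)
```

`mock.patch.object` restores the original `sys.argv` when the `with` block exits, so tests do not leak arguments into each other.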
# These are the supported time formats for zodbtools <tidrange>
# Format of this file is:
# <reference time> <tid in hex format> <time input format>
#
# These must be run with current time: 2009-08-30T19:20:00Z
# in Europe/Paris timezone.
# ( as a timestamp: 1251660000 )
# some absolute date formats:
# RFC3339
2018-01-01T10:30:00Z 03c4857600000000 2018-01-01T10:30:00Z
1985-04-12T23:20:50.520000Z 02b914f8d78d4fdf 1985-04-12T23:20:50.52Z
1996-12-20T00:39:57Z 03189927f3333333 1996-12-19T16:39:57-08:00
2018-01-01T05:30:00Z 03c4844a00000000 2018-01-01T10:30:00+05:00
# RFC822
1976-08-26T14:29:00Z 02728aa500000000 26 Aug 76 14:29 GMT
1976-08-26T12:29:00Z 02728a2d00000000 26 Aug 76 14:29 +02:00
# RFC850 -> not supported (by go implementation)
#2006-01-02T22:04:05Z 036277cc15555555 Monday, 02-Jan-06 15:04:05 MST
# RFC1123 -> not supported (by go implementation)
#2006-01-02T22:04:05Z 036277cc15555555 Mon, 02 Jan 2006 15:04:05 MST
#2006-01-02T22:04:05Z 036277cc15555555 Mon, 02 Jan 2006 23:04:05 GMT+1
# explicit UTC timezone
2018-01-01T10:30:00Z 03c4857600000000 2018-01-01 10:30:00 UTC
2018-01-02T00:00:00Z 03c488a000000000 2018-01-02 UTC
# Relative formats, based on git's test for approxidate
# (adapted for timezone Europe/Paris and extended a bit)
2009-08-30T19:20:00Z 03805ec800000000 now
2009-08-30T19:19:55Z 03805ec7eaaaaaaa 5 seconds ago
2009-08-30T19:19:55Z 03805ec7eaaaaaaa 5.seconds.ago
2009-08-30T19:10:00Z 03805ebe00000000 10.minutes.ago
2009-08-29T19:20:00Z 0380592800000000 yesterday
2009-08-27T19:20:00Z 03804de800000000 3.days.ago
2009-08-09T19:20:00Z 037fe8a800000000 3.weeks.ago
2009-05-30T19:20:00Z 037e53a800000000 3.months.ago
2009-08-30T19:19:00Z 03805ec700000000 1 minute ago
2009-08-29T19:20:00Z 0380592800000000 1 day ago
2009-07-30T19:20:00Z 037fb06800000000 1 month ago
# go's when does not support "chaining" like this
#2007-05-30T19:20:00Z 036dfaa800000000 2.years.3.months.ago
2009-08-29T04:00:00Z 0380559000000000 6am yesterday
2009-08-29T16:00:00Z 0380586000000000 6pm yesterday
2009-08-30T01:00:00Z 03805a7c00000000 3:00
2009-08-30T13:00:00Z 03805d4c00000000 15:00
2009-08-30T10:00:00Z 03805c9800000000 noon today
2009-08-29T10:00:00Z 038056f800000000 noon yesterday
# this input is a bit weird also, what does "noon pm" mean?
# it seems to trigger a bug in python's parser
# TypeError: can't compare offset-naive and offset-aware datetimes
#2009-01-05T12:00:00Z 037b0bd000000000 January 5th noon pm
# this input is "ambiguous"
#2009-08-29T12:00:00Z 0380577000000000 10am noon
# not supported by date parser
#2009-08-25T19:20:00Z 038042a800000000 last tuesday
# inconsistent behavior ( go keeps current hour:minutes - python uses midnight )
# this also raises TypeError on python
#2009-07-05T00:00:00Z 037f1f4000000000 July 5th
# parsed as month/day (at least for me ... it might depend on some locale settings other than $TZ ?)
#2009-05-06T00:00:00Z 037dc82000000000 06.05.2009
# go parser is wrong on this one
#2009-06-06T05:00:00Z 037e77ac00000000 Jun 6, 5AM
# go parser is wrong on this one
#2009-06-06T05:00:00Z 037e77ac00000000 5AM Jun 6
2009-06-07T04:00:00Z 037e7d1000000000 6AM, June 7, 2009
# python and go disagree on these two, go sees them as 00:00 UTC
#2008-11-30T23:00:00Z 037a3e4400000000 2008-12-01
#2009-11-30T23:00:00Z 03826ac400000000 2009-12-01
#2009-06-04T22:00:00Z 037e706800000000 06/05/2009
# ( end of tests from git )
# more tests
### works with python implementation, but not supported:
#2018-01-01T09:30:00Z 03c4853a00000000 le 1er janvier 2018 à 10h30
#2018-01-01T23:00:00Z 03c4886400000000 2018年1月2日
### some invalid formats that "look OK"
# wrong format on timezone (should be 2009-06-01T22:00:00+09:00)
#2009-06-01T01:00:00Z 037e5a9c00000000 2009-06-01T10:00:00:+09:00
# day is 34
# ERROR XXX 2009-06-34T22:00:00Z
# one-digit hour, minute and second
# ERROR XXX 2009-06-01T1:2:3
# month uses a capital letter O instead of the digit 0
# ERROR XXX 2009-O6-01T22:00:00Z
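The hex TIDs in the table above can be reproduced by hand. The sketch below packs a UTC time into 8 bytes the way the rows suggest: a 32-bit minute counter followed by the seconds scaled to a 32-bit fraction of a minute. It is a reading of the data, checked only against rows of this file. `tid_from_utc` is a hypothetical helper; the real code uses `ZODB.TimeStamp`.

```python
import struct
from datetime import datetime


def tid_from_utc(dt):
    """Pack a naive UTC datetime into an 8-byte TID-like value."""
    # high 4 bytes: minutes, counted with fixed 12-month years and 31-day months
    minutes = ((((dt.year - 1900) * 12 + dt.month - 1) * 31 + dt.day - 1) * 24
               + dt.hour) * 60 + dt.minute
    # low 4 bytes: seconds within the minute, scaled so that 60s == 2**32
    seconds = dt.second + dt.microsecond / 1000000.
    frac = int(seconds * (1 << 32) / 60)
    return struct.pack(">II", minutes, frac)


tid = tid_from_utc(datetime(2018, 1, 1, 10, 30, 0))
```

For times on a whole minute the low four bytes are zero, which is why many rows end in `00000000`.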
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (C) 2019 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
"""utilities for testing"""
from ZODB.FileStorage import FileStorage
from ZODB import DB
import transaction
from tempfile import mkdtemp
from shutil import rmtree
from golang import func, defer
# zext_supported checks whether ZODB supports txn.extension_bytes .
_zext_supported_memo = None
def zext_supported():
global _zext_supported_memo
if _zext_supported_memo is not None:
return _zext_supported_memo
_ = _zext_supported_memo = _zext_supported()
return _
@func
def _zext_supported():
tmpd = mkdtemp('', 'zext_check.')
defer(lambda: rmtree(tmpd))
dbfs = tmpd + '/1.fs'
stor = FileStorage(dbfs, create=True)
db = DB(stor)
conn = db.open()
root = conn.root()
root._p_changed = True
txn = transaction.get()
txn.setExtendedInfo('a', 'b')
txn.commit()
for last_txn in stor.iterator(start=stor.lastTransaction()):
break
else:
assert False, "cannot see details of last transaction"
assert last_txn.extension == {'a': 'b'}
return hasattr(last_txn, 'extension_bytes')
# -*- coding: utf-8 -*-
# zodbtools - various utility routines
# Copyright (C) 2016-2017 Nexedi SA and Contributors.
# Copyright (C) 2016-2019 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# Jérome Perrin <jerome@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
......@@ -18,14 +20,23 @@
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
import hashlib
import hashlib, struct, codecs, io
import zodburi
from six.moves.urllib_parse import urlsplit, urlunsplit
from zlib import crc32, adler32
from ZODB.TimeStamp import TimeStamp
import dateparser
def ashex(s):
return s.encode('hex')
# type: (bytes) -> bytes
return codecs.encode(s, 'hex')
def fromhex(s):
# type: (Union[str,bytes]) -> bytes
return codecs.decode(s, 'hex')
def sha1(data):
# type: (bytes) -> bytes
m = hashlib.sha1()
m.update(data)
return m.digest()
......@@ -57,20 +68,73 @@ def txnobjv(txn):
return objv
# "tidmin..tidmax" -> (tidmin, tidmax)
class TidRangeInvalid(Exception):
class TidInvalid(ValueError):
pass
class TidRangeInvalid(ValueError):
pass
def parse_tid(tid_string, raw_only=False):
"""Try to parse `tid_string` as a time and returns the
corresponding raw TID.
If `tid_string` is a 16-character hex string, it is assumed
to already be a TID and is decoded without time parsing.
This function raises TidInvalid when `tid_string`
is invalid.
"""
assert isinstance(tid_string, (str, bytes))
# If it "looks like a TID", don't try to parse it as time,
# because parsing is slow.
if len(tid_string) == 16:
try:
return fromhex(tid_string)
except ValueError:
pass
if raw_only:
# either it was not 16-char string or hex decoding failed
raise TidInvalid(tid_string)
# preprocess to support `1.day.ago` style formats like git log does.
if "ago" in tid_string:
tid_string = tid_string.replace(".", " ").replace("_", " ")
parsed_time = dateparser.parse(
tid_string,
settings={
'TO_TIMEZONE': 'UTC',
'RETURN_AS_TIMEZONE_AWARE': True
})
if not parsed_time:
# parsing as date failed
raise TidInvalid(tid_string)
# build a ZODB.TimeStamp to convert as a TID
return TimeStamp(
parsed_time.year,
parsed_time.month,
parsed_time.day,
parsed_time.hour,
parsed_time.minute,
parsed_time.second + parsed_time.microsecond / 1000000.).raw()
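The `ago` preprocessing step in `parse_tid` is easy to check in isolation. The sketch below copies just that step into a hypothetical helper, `normalize_relative`, to show why the replacement is guarded by the `"ago"` test: absolute timestamps may contain a meaningful `.` (fractional seconds) that must be left alone.

```python
def normalize_relative(s):
    """Turn git-style `5.seconds.ago` / `3_days_ago` into space-separated words."""
    if "ago" in s:
        s = s.replace(".", " ").replace("_", " ")
    return s


relative = normalize_relative("5.seconds.ago")
absolute = normalize_relative("1985-04-12T23:20:50.52Z")
```

One consequence of this simple rule, as far as this sketch goes, is that any dot in an input containing `ago` is replaced, including a decimal point.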
# parse_tidrange parses a string into (tidmin, tidmax).
#
# see `zodb help tidrange` for accepted tidrange syntax.
def parse_tidrange(tidrange):
try:
tidmin, tidmax = tidrange.split("..")
except ValueError: # not exactly 2 parts in between ".."
raise TidRangeInvalid(tidrange)
try:
tidmin = tidmin.decode("hex")
tidmax = tidmax.decode("hex")
except TypeError: # hex decoding error
raise TidRangeInvalid(tidrange)
if tidmin:
tidmin = parse_tid(tidmin)
if tidmax:
tidmax = parse_tid(tidmax)
# empty tid means -inf / +inf respectively
# ( which is None in IStorage.iterator() )
......@@ -101,3 +165,73 @@ def storageFromURL(url, read_only=None):
stor = stor_factory()
return stor
# ---- hashing ----
# hasher that discards data
class NullHasher:
name = "null"
digest_size = 1
def update(self, data):
pass
def digest(self):
return b'\0'
def hexdigest(self):
return "00"
# adler32 in hashlib interface
class Adler32Hasher:
name = "adler32"
digest_size = 4
def __init__(self):
self._h = adler32(b'')
def update(self, data):
self._h = adler32(data, self._h)
def digest(self):
return struct.pack('>I', self._h & 0xffffffff)
def hexdigest(self):
return '%08x' % (self._h & 0xffffffff)
# crc32 in hashlib interface
class CRC32Hasher:
name = "crc32"
digest_size = 4
def __init__(self):
self._h = crc32(b'')
def update(self, data):
self._h = crc32(data, self._h)
def digest(self):
return struct.pack('>I', self._h & 0xffffffff)
def hexdigest(self):
return '%08x' % (self._h & 0xffffffff)
# {} name -> hasher
hashRegistry = {
"null": NullHasher,
"adler32": Adler32Hasher,
"crc32": CRC32Hasher,
"sha1": hashlib.sha1,
"sha256": hashlib.sha256,
"sha512": hashlib.sha512,
}
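The wrappers above give `zlib` checksums the same `update()` / `digest()` / `hexdigest()` surface as `hashlib` objects, so callers can pick a hasher by name. A self-contained sketch, repeating the adler32 wrapper so that it runs on its own, shows that feeding data in pieces gives the same result as a one-shot checksum:

```python
import struct
import zlib


class Adler32Hasher:
    name = "adler32"
    digest_size = 4

    def __init__(self):
        self._h = zlib.adler32(b"")

    def update(self, data):
        # the running value is passed back in, so updates can be chained
        self._h = zlib.adler32(data, self._h)

    def digest(self):
        return struct.pack(">I", self._h & 0xffffffff)

    def hexdigest(self):
        return "%08x" % (self._h & 0xffffffff)


h = Adler32Hasher()
h.update(b"hello ")
h.update(b"world")
one_shot = zlib.adler32(b"hello world") & 0xffffffff
```

The `& 0xffffffff` mask keeps the value unsigned, which matters on Python 2 where `zlib.adler32` may return a negative number.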
# ---- IO ----
# asbinstream returns the binary stream associated with stream.
# For example on py3 sys.stdout is a text stream (io.TextIOWrapper) which does not allow writing binary data to it.
def asbinstream(stream):
# type: (IO) -> BinaryIO
if isinstance(stream, io.TextIOBase):
return stream.buffer
return stream
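A short sketch of how `asbinstream` behaves, using in-memory streams instead of `sys.stdout`. The function body is repeated so the example runs on its own; `raw` and `text` are illustrative names.

```python
import io


def asbinstream(stream):
    # text streams expose their underlying binary stream as .buffer
    if isinstance(stream, io.TextIOBase):
        return stream.buffer
    return stream


raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding="utf-8")

out = asbinstream(text)      # unwraps to the underlying BytesIO
out.write(b"\x00\xff")       # raw bytes, not valid UTF-8, written as-is
```

When text and binary writes are mixed on the same stream, the text layer should be flushed first, otherwise output may appear out of order.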
#!/usr/bin/env python
# Copyright (C) 2017 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# -*- coding: utf-8 -*-
# Copyright (C) 2017-2021 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# Jérome Perrin <jerome@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
......@@ -35,7 +37,7 @@ def register_command(cmdname):
command_module = importlib.import_module('zodbtools.zodb' + cmdname)
command_dict[cmdname] = command_module
for _ in ('analyze', 'catobj', 'cmp', 'dump', 'info'):
for _ in ('analyze', 'catobj', 'cmp', 'dump', 'info', 'restore'):
register_command(_)
......@@ -51,10 +53,7 @@ Usage:
The commands are:
""", file=out)
cmdv = command_dict.keys()
cmdv.sort()
for cmd in cmdv:
cmd_module = command_dict[cmd]
for cmd, cmd_module in sorted(command_dict.items()):
print(" %-11s %s" % (cmd, cmd_module.summary), file=out)
print("""\
......
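The change above replaces `keys()` followed by an in-place `.sort()`, which fails on Python 3 where `dict.keys()` is a view, with `sorted(command_dict.items())`. A minimal sketch with made-up summaries (the real `command_dict` maps names to modules):

```python
# hypothetical command table: name -> one-line summary
command_dict = {
    "info": "print general information about a database",
    "dump": "dump content of a database",
    "cmp": "compare two databases",
}

# sorted() works on any iterable and returns a new list on both py2 and py3
lines = ["  %-11s %s" % (cmd, summary)
         for cmd, summary in sorted(command_dict.items())]
```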
# -*- coding: utf-8 -*-
# Copyright (C) 2002-2017 Zope Foundation + Nexedi + Contributors
# See LICENSE-ZPL.txt for full licensing terms.
# Based on a transaction analyzer by Matt Kromer.
from __future__ import print_function
import sys
import os
import getopt
import anydbm as dbm
from six.moves import dbm_gnu as dbm
import tempfile
import shutil
from ZODB.FileStorage import FileIterator, FileStorage, packed_version
from ZODB.FileStorage import FileIterator, packed_version
from ZODB.FileStorage.format import FileStorageFormatter
from ZODB.utils import get_pickle_metadata
from zodbtools.util import storageFromURL, parse_tidrange, ashex
from golang import func, defer
class DeltaFileStorage(
FileStorageFormatter,
......@@ -22,8 +27,11 @@ class DeltaFileStorage(
def iterator(self, start=None, stop=None):
return DeltaFileIterator(self._file_name, start, stop)
def close(self):
pass
class DeltaFileIterator(FileIterator):
def __init__(self, filename, start=None, stop=None, pos=0L):
def __init__(self, filename, start=None, stop=None, pos=0):
assert isinstance(filename, str)
file = open(filename, 'rb')
self._file = file
......@@ -68,6 +76,8 @@ class Report:
self.CBYTESMAP = {}
self.FOIDSMAP = {}
self.FBYTESMAP = {}
self.tidmin = None # first scanned transaction
self.tidmax = None # last ----//----
def shorten(s, n):
l = len(s)
......@@ -86,12 +96,18 @@ def shorten(s, n):
def report(rep, csv=False):
delta_fs = rep.delta_fs
if not csv:
print "Processed %d records in %d transactions" % (rep.OIDS, rep.TIDS)
print "Average record size is %7.2f bytes" % (rep.DBYTES * 1.0 / rep.OIDS)
if rep.TIDS == 0:
print ("# ø")
print ("No transactions processed")
return
print ("# %s..%s" % (ashex(rep.tidmin), ashex(rep.tidmax)))
print ("Processed %d records in %d transactions" % (rep.OIDS, rep.TIDS))
print ("Average record size is %7.2f bytes" % (rep.DBYTES * 1.0 / rep.OIDS))
print ("Average transaction size is %7.2f bytes" %
(rep.DBYTES * 1.0 / rep.TIDS))
print "Types used:"
print ("Types used:")
if delta_fs:
if csv:
fmt = "%s,%s,%s,%s,%s"
......@@ -99,9 +115,9 @@ def report(rep, csv=False):
else:
fmt = "%-46s %7s %9s %6s %7s"
fmtp = "%-46s %7d %9d %5.1f%% %7.2f" # per-class format
print fmt % ("Class Name", "T.Count", "T.Bytes", "Pct", "AvgSize")
print (fmt % ("Class Name", "T.Count", "T.Bytes", "Pct", "AvgSize"))
if not csv:
print fmt % ('-'*46, '-'*7, '-'*9, '-'*5, '-'*7)
print (fmt % ('-'*46, '-'*7, '-'*9, '-'*5, '-'*7))
else:
if csv:
fmt = "%s,%s,%s,%s,%s,%s,%s,%s,%s"
......@@ -109,15 +125,13 @@ def report(rep, csv=False):
else:
fmt = "%-46s %7s %9s %6s %7s %7s %9s %7s %9s"
fmtp = "%-46s %7d %9d %5.1f%% %7.2f %7d %9d %7d %9d" # per-class format
print fmt % ("Class Name", "T.Count", "T.Bytes", "Pct", "AvgSize",
"C.Count", "C.Bytes", "O.Count", "O.Bytes")
print (fmt % ("Class Name", "T.Count", "T.Bytes", "Pct", "AvgSize",
"C.Count", "C.Bytes", "O.Count", "O.Bytes"))
if not csv:
print fmt % ('-'*46, '-'*7, '-'*9, '-'*5, '-'*7, '-'*7, '-'*9, '-'*7, '-'*9)
print (fmt % ('-'*46, '-'*7, '-'*9, '-'*5, '-'*7, '-'*7, '-'*9, '-'*7, '-'*9))
fmts = "%46s %7d %8dk %5.1f%% %7.2f" # summary format
typemap = rep.TYPEMAP.keys()
typemap.sort(key=lambda a:rep.TYPESIZE[a])
cumpct = 0.0
for t in typemap:
for t in sorted(rep.TYPEMAP.keys(), key=lambda a:rep.TYPESIZE[a]):
pct = rep.TYPESIZE[t] * 100.0 / rep.DBYTES
cumpct += pct
if csv:
......@@ -125,44 +139,46 @@ def report(rep, csv=False):
else:
t_display = shorten(t, 46)
if delta_fs:
print fmtp % (t_display, rep.TYPEMAP[t], rep.TYPESIZE[t],
pct, rep.TYPESIZE[t] * 1.0 / rep.TYPEMAP[t])
print (fmtp % (t_display, rep.TYPEMAP[t], rep.TYPESIZE[t],
pct, rep.TYPESIZE[t] * 1.0 / rep.TYPEMAP[t]))
else:
print fmtp % (t_display, rep.TYPEMAP[t], rep.TYPESIZE[t],
pct, rep.TYPESIZE[t] * 1.0 / rep.TYPEMAP[t],
rep.COIDSMAP[t], rep.CBYTESMAP[t],
rep.FOIDSMAP.get(t, 0), rep.FBYTESMAP.get(t, 0))
print (fmtp % (t_display, rep.TYPEMAP[t], rep.TYPESIZE[t],
pct, rep.TYPESIZE[t] * 1.0 / rep.TYPEMAP[t],
rep.COIDSMAP[t], rep.CBYTESMAP[t],
rep.FOIDSMAP.get(t, 0), rep.FBYTESMAP.get(t, 0)))
if csv:
return
if delta_fs:
print fmt % ('='*46, '='*7, '='*9, '='*5, '='*7)
print "%46s %7d %9s %6s %6.2f" % ('Total Transactions', rep.TIDS, ' ',
' ', rep.DBYTES * 1.0 / rep.TIDS)
print fmts % ('Total Records', rep.OIDS, rep.DBYTES, cumpct,
rep.DBYTES * 1.0 / rep.OIDS)
print (fmt % ('='*46, '='*7, '='*9, '='*5, '='*7))
print ("%46s %7d %9s %6s %6.2f" % ('Total Transactions', rep.TIDS, ' ',
' ', rep.DBYTES * 1.0 / rep.TIDS))
print (fmts % ('Total Records', rep.OIDS, rep.DBYTES, cumpct,
rep.DBYTES * 1.0 / rep.OIDS))
else:
print fmt % ('='*46, '='*7, '='*9, '='*5, '='*7, '='*7, '='*9, '='*7, '='*9)
print "%46s %7d %9s %6s %6.2fk" % ('Total Transactions', rep.TIDS, ' ',
' ', rep.DBYTES * 1.0 / rep.TIDS / 1024.0)
print fmts % ('Total Records', rep.OIDS, rep.DBYTES / 1024.0, cumpct,
rep.DBYTES * 1.0 / rep.OIDS)
print (fmt % ('='*46, '='*7, '='*9, '='*5, '='*7, '='*7, '='*9, '='*7, '='*9))
print ("%46s %7d %9s %6s %6.2fk" % ('Total Transactions', rep.TIDS, ' ',
' ', rep.DBYTES * 1.0 / rep.TIDS / 1024.0))
print (fmts % ('Total Records', rep.OIDS, rep.DBYTES / 1024.0, cumpct,
rep.DBYTES * 1.0 / rep.OIDS))
print fmts % ('Current Objects', rep.COIDS, rep.CBYTES / 1024.0,
rep.CBYTES * 100.0 / rep.DBYTES,
rep.CBYTES * 1.0 / rep.COIDS)
print (fmts % ('Current Objects', rep.COIDS, rep.CBYTES / 1024.0,
rep.CBYTES * 100.0 / rep.DBYTES,
rep.CBYTES * 1.0 / rep.COIDS))
if rep.FOIDS:
print fmts % ('Old Objects', rep.FOIDS, rep.FBYTES / 1024.0,
rep.FBYTES * 100.0 / rep.DBYTES,
rep.FBYTES * 1.0 / rep.FOIDS)
print (fmts % ('Old Objects', rep.FOIDS, rep.FBYTES / 1024.0,
rep.FBYTES * 100.0 / rep.DBYTES,
rep.FBYTES * 1.0 / rep.FOIDS))
def analyze(path, use_dbm, delta_fs):
@func
def analyze(path, use_dbm, delta_fs, tidmin, tidmax):
if delta_fs:
fs = DeltaFileStorage(path, read_only=1)
else:
fs = FileStorage(path, read_only=1)
fsi = fs.iterator()
fs = storageFromURL(path, read_only=1)
defer(fs.close)
fsi = fs.iterator(tidmin, tidmax)
report = Report(use_dbm, delta_fs)
for txn in fsi:
analyze_trans(report, txn)
......@@ -172,6 +188,10 @@ def analyze(path, use_dbm, delta_fs):
def analyze_trans(report, txn):
report.TIDS += 1
if report.tidmin is None:
# first seen transaction
report.tidmin = txn.tid
report.tidmax = txn.tid
for rec in txn:
analyze_rec(report, rec)
......@@ -220,29 +240,32 @@ def analyze_rec(report, record):
report.CBYTESMAP[type] = report.CBYTESMAP.get(type, 0) + size - fsize
report.TYPEMAP[type] = report.TYPEMAP.get(type, 0) + 1
report.TYPESIZE[type] = report.TYPESIZE.get(type, 0) + size
except Exception, err:
print err
except Exception as err:
print (err, file=sys.stderr)
__doc__ = """%(program)s: Analyzer for ZODB data or repozo deltafs
__doc__ = """%(program)s: Analyzer for FileStorage data or repozo deltafs
usage: %(program)s [options] <storage> [<tidrange>]
usage: %(program)s [options] /path/to/Data.fs (or /path/to/file.deltafs)
<storage> is an URL (see 'zodb help zurl') or /path/to/file.deltafs(*)
<tidrange> is a history range (see 'zodb help tidrange') to analyze.
Options:
-h, --help this help screen
-c, --csv output CSV
-d, --dbm use DBM as temporary storage to limit memory usage
(no meaning for deltafs case)
Note:
(*) Note:
Input deltafs file should be uncompressed.
"""
summary = "analyze FileStorage or repozo deltafs usage"
summary = "analyze ZODB database or repozo deltafs usage"
def usage(stream, msg=None):
if msg:
print >>stream, msg
print >>stream
print >>stream, __doc__ % {"program": "zodb analyze"}
print (msg, file=stream)
print (file=stream)
print (__doc__ % {"program": "zodb analyze"}, file=stream)
def main(argv):
......@@ -250,9 +273,15 @@ def main(argv):
opts, args = getopt.getopt(argv[1:],
'hcd', ['help', 'csv', 'dbm'])
path = args[0]
except (getopt.GetoptError, IndexError), msg:
except (getopt.GetoptError, IndexError) as msg:
usage(sys.stderr, msg)
sys.exit(2)
# parse tidmin..tidmax
tidmin = tidmax = None
if len(args) > 1:
tidmin, tidmax = parse_tidrange(args[1])
csv = False
use_dbm = False
for opt, args in opts:
......@@ -263,15 +292,16 @@ def main(argv):
if opt in ('-h', '--help'):
usage(sys.stdout)
sys.exit()
header = open(path, 'rb').read(4)
if header == packed_version:
delta_fs = False
else:
delta_fs = True
_orig_read_data_header = FileStorageFormatter._read_data_header
def _read_data_header(self, pos, oid=None):
h = _orig_read_data_header(self, pos, oid=oid)
h.tloc = self._tpos
return h
FileStorageFormatter._read_data_header = _read_data_header
report(analyze(path, use_dbm, delta_fs), csv)
# try to see whether it is zurl or a path to file.deltafs
delta_fs = False
if os.path.exists(path):
header = open(path, 'rb').read(4)
if header != packed_version:
delta_fs = True
_orig_read_data_header = FileStorageFormatter._read_data_header
def _read_data_header(self, pos, oid=None):
h = _orig_read_data_header(self, pos, oid=oid)
h.tloc = self._tpos
return h
FileStorageFormatter._read_data_header = _read_data_header
report(analyze(path, use_dbm, delta_fs, tidmin, tidmax), csv)
# -*- coding: utf-8 -*-
# Copyright (C) 2016-2017 Nexedi SA and Contributors.
# Copyright (C) 2016-2018 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
......@@ -20,7 +20,7 @@
"""Zodbcmp - Tool to compare two ZODB databases
Zodbcmp compares two ZODB databases in between tidmin..tidmax transaction range
with default range being -∞..+∞ - (whole database).
with default range being 0..+∞ - (whole database).
For comparison both databases are scanned at storage layer and every
transaction content is compared bit-to-bit between the two. The program stops
......@@ -34,6 +34,7 @@ from __future__ import print_function
from zodbtools.util import ashex, inf, nextitem, txnobjv, parse_tidrange, TidRangeInvalid, \
storageFromURL
from time import time
from golang import func, defer
# compare two storage transactions
# 0 - equal, 1 - non-equal
......@@ -120,6 +121,7 @@ Usage: zodb cmp [OPTIONS] <storage1> <storage2> [tidmin..tidmax]
Compare two ZODB databases.
<storageX> is an URL (see 'zodb help zurl') of a ZODB-storage.
<tidrange> is a history range (see 'zodb help tidrange') to compare.
Options:
......@@ -127,6 +129,7 @@ Options:
-h --help show this help
""", file=out)
@func
def main2(argv):
verbose = False
......@@ -159,8 +162,8 @@ def main2(argv):
print("E: invalid tidrange: %s" % e, file=sys.stderr)
sys.exit(2)
stor1 = storageFromURL(storurl1, read_only=True)
stor2 = storageFromURL(storurl2, read_only=True)
stor1 = storageFromURL(storurl1, read_only=True); defer(stor1.close)
stor2 = storageFromURL(storurl2, read_only=True); defer(stor2.close)
zcmp = storcmp(stor1, stor2, tidmin, tidmax, verbose)
sys.exit(1 if zcmp else 0)
......
# Copyright (C) 2018-2021 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
"""Zodbcommit - Commit new transaction into a ZODB database
Zodbcommit reads transaction description from stdin and commits read data into
ZODB. The transaction to be committed is read in zodbdump format, but without
first 'txn' header line. For example::
user "author"
description "change 123"
extension ""
obj 0000000000000001 4 null:00
ZZZZ
On success the ID of committed transaction is printed to stdout.
On conflict or other problem - the error is printed to stderr and exit code is !0.
Zodbcommit requires `at` parameter to be given. This specifies the caller's idea
about its current database view and is used to detect conflicting simultaneous
commits. `at` is required because zodbcommit is a plumbing-level command and
implicitly using storage last_tid instead of it could hide bugs. In scripts one
can query current database head (last_tid) with `zodb info <stor> last_tid`.
"""
from __future__ import print_function
from zodbtools import zodbdump
from zodbtools.util import ashex, fromhex, storageFromURL
from ZODB.interfaces import IStorageRestoreable
from ZODB.utils import p64, u64, z64
from ZODB.POSException import POSKeyError
from ZODB._compat import BytesIO
from golang import func, defer, panic
import warnings
# zodbcommit commits new transaction into ZODB storage with data specified by
# zodbdump transaction.
#
# txn.tid acts as a flag:
# - with tid=0 the transaction is committed regularly.
# - with tid=!0 the transaction is recreated with exactly that tid via IStorageRestoreable.
#
# tid of created transaction is returned.
_norestoreWarned = set() # of storage class
def zodbcommit(stor, at, txn):
assert isinstance(txn, zodbdump.Transaction)
want_restore = (txn.tid != z64)
have_restore = IStorageRestoreable.providedBy(stor)
# warn once if stor does not implement IStorageRestoreable
if want_restore and (not have_restore):
if type(stor) not in _norestoreWarned:
warnings.warn("restore: %s does not provide IStorageRestoreable ...\n"
"\t... -> will try to emulate it on best-effort basis." %
type(stor), RuntimeWarning)
_norestoreWarned.add(type(stor))
if want_restore:
# even if stor might be not providing IStorageRestoreable and not
# supporting .restore, it can still support .tpc_begin(tid=...). An example
# of this is NEO. We anyway need to be able to specify which transaction ID
# we need to restore transaction with.
stor.tpc_begin(txn, tid=txn.tid)
else:
stor.tpc_begin(txn)
def _():
def current_serial(oid):
return _serial_at(stor, oid, at)
for obj in txn.objv:
data = None # data to be committed - setup vvv
copy_from = None
if isinstance(obj, zodbdump.ObjectCopy):
copy_from = obj.copy_from
data, _, _ = stor.loadBefore(obj.oid, p64(u64(obj.copy_from)+1))
elif isinstance(obj, zodbdump.ObjectDelete):
data = None
elif isinstance(obj, zodbdump.ObjectData):
if isinstance(obj.data, zodbdump.HashOnly):
raise ValueError('cannot commit transaction with hashonly object')
data = obj.data
else:
panic('invalid object record: %r' % (obj,))
# we have the data -> restore/store the object.
# if it will be ConflictError - we just fail and let the caller retry.
if data is None:
stor.deleteObject(obj.oid, current_serial(obj.oid), txn)
else:
if want_restore and have_restore:
stor.restore(obj.oid, txn.tid, data, '', copy_from, txn)
else:
# FIXME we don't handle copy_from on commit
# NEO does not support restore, and even if stor supports restore,
# going that way requires to already know tid for transaction we are
# committing. -> we just imitate copy by actually copying data and
# letting the storage deduplicate it.
stor.store(obj.oid, current_serial(obj.oid), data, '', txn)
try:
_()
stor.tpc_vote(txn)
except:
stor.tpc_abort(txn)
raise
# in ZODB >= 5 tpc_finish returns tid directly, but on ZODB 4 it
# does not do so. Since we still need to support ZODB 4, utilize tpc_finish
# callback to know with which tid the transaction was committed.
_ = []
stor.tpc_finish(txn, lambda tid: _.append(tid))
assert len(_) == 1
tid = _[0]
if want_restore and (tid != txn.tid):
panic('restore: restored transaction has tid=%s, but requested was tid=%s' %
(ashex(tid), ashex(txn.tid)))
return tid
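The control flow of `zodbcommit` (begin, store, vote, abort on any error, then finish and pick up the tid through a callback) can be exercised against a fake. Everything in this sketch is hypothetical: `FakeStorage` only records calls and is far simpler than a real ZODB storage, and `commit` is a reduced stand-in for the function above.

```python
class FakeStorage:
    """Records which two-phase-commit calls were made, in order."""
    def __init__(self, fail_on_store=False):
        self.calls = []
        self.fail_on_store = fail_on_store

    def tpc_begin(self, txn):
        self.calls.append("tpc_begin")

    def store(self, oid, data, txn):
        self.calls.append("store")
        if self.fail_on_store:
            raise RuntimeError("simulated conflict")

    def tpc_vote(self, txn):
        self.calls.append("tpc_vote")

    def tpc_abort(self, txn):
        self.calls.append("tpc_abort")

    def tpc_finish(self, txn, callback):
        self.calls.append("tpc_finish")
        callback(b"\x00" * 7 + b"\x01")   # made-up tid


def commit(stor, txn, objects):
    stor.tpc_begin(txn)
    try:
        for oid, data in objects:
            stor.store(oid, data, txn)
        stor.tpc_vote(txn)
    except BaseException:
        stor.tpc_abort(txn)   # never leave the storage in a begun transaction
        raise
    got = []
    stor.tpc_finish(txn, got.append)   # works whether or not finish returns tid
    assert len(got) == 1
    return got[0]
```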
# _serial_at returns oid's serial as of @at database state.
def _serial_at(stor, oid, at):
before = p64(u64(at)+1)
try:
xdata = stor.loadBefore(oid, before)
except POSKeyError:
serial = z64
else:
if xdata is None:
serial = z64
else:
_, serial, _ = xdata
return serial
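`_serial_at` asks for the state "before `at + 1`" because `loadBefore` takes an exclusive upper bound. The arithmetic on 8-byte big-endian TIDs can be sketched with `struct` alone. `p64` and `u64` below are local stand-ins for the `ZODB.utils` helpers of the same name, and `before_for` is a hypothetical name.

```python
import struct


def p64(v):
    """Pack an integer into 8 big-endian bytes."""
    return struct.pack(">Q", v)


def u64(b):
    """Unpack 8 big-endian bytes into an integer."""
    return struct.unpack(">Q", b)[0]


def before_for(at):
    # loadBefore(oid, before) returns data with tid < before,
    # so "as of at, inclusive" means before = at + 1
    return p64(u64(at) + 1)


before = before_for(b"\x00\x00\x00\x00\x00\x00\x00\xff")
```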
# ----------------------------------------
import sys, getopt
summary = "commit new transaction into a ZODB database"
def usage(out):
print("""\
Usage: zodb commit [OPTIONS] <storage> <at> < input
Commit new transaction into a ZODB database.
The transaction to be committed is read from stdin in zodbdump format without
first 'txn' header line.
<storage> is an URL (see 'zodb help zurl') of a ZODB-storage.
<at> is the transaction ID of the caller's idea of its current database view.
On success the ID of committed transaction is printed to stdout.
Options:
-h --help show this help
""", file=out)
@func
def main(argv):
try:
optv, argv = getopt.getopt(argv[1:], "h", ["help"])
except getopt.GetoptError as e:
print(e, file=sys.stderr)
usage(sys.stderr)
sys.exit(2)
for opt, _ in optv:
if opt in ("-h", "--help"):
usage(sys.stdout)
sys.exit(0)
if len(argv) != 2:
usage(sys.stderr)
sys.exit(2)
storurl = argv[0]
at = fromhex(argv[1])
stor = storageFromURL(storurl)
defer(stor.close)
# artificial transaction header with tid=0 to request regular commit
zin = b'txn 0000000000000000 " "\n'
zin += getattr(sys.stdin, 'buffer', sys.stdin).read() # read bytes on both py2 and py3
zin = BytesIO(zin)
zr = zodbdump.DumpReader(zin)
zr.lineno -= 1 # we prepended txn header
txn = zr.readtxn()
tail = zin.read()
if tail:
print('E: +%d: garbage after transaction' % zr.lineno, file=sys.stderr)
sys.exit(1)
tid = zodbcommit(stor, at, txn)
print(ashex(tid))
# Copyright (C) 2016-2017 Nexedi SA and Contributors.
# -*- coding: utf-8 -*-
# Copyright (C) 2016-2021 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# Jérome Perrin <jerome@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
......@@ -16,7 +18,7 @@
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
"""Zodbdump - Tool to dump content of a ZODB database
r"""Zodbdump - Tool to dump content of a ZODB database
This program dumps content of a ZODB database.
It uses ZODB Storage iteration API to get list of transactions and for every
......@@ -24,8 +26,8 @@ transaction prints transaction's header and information about changed objects.
The information dumped is complete raw information as stored in ZODB storage
and should be suitable for restoring the database from the dump file bit-to-bit
identical to its original(*). It is dumped in semi text-binary format where
object data is output as raw binary and everything else is text.
identical to its original(*) via Zodbrestore. It is dumped in semi text-binary
format where object data is output as raw binary and everything else is text.
There is also shortened mode activated via --hashonly where only hash of object
data is printed without content.
......@@ -53,15 +55,19 @@ TODO also protect txn record by hash.
"""
from __future__ import print_function
from zodbtools.util import ashex, sha1, txnobjv, parse_tidrange, TidRangeInvalid, \
storageFromURL
from zodbtools.util import ashex, fromhex, sha1, txnobjv, parse_tidrange, TidRangeInvalid, \
storageFromURL, hashRegistry, asbinstream
from ZODB._compat import loads, _protocol, BytesIO
from zodbpickle.slowpickle import Pickler as pyPickler
#import pickletools
from ZODB.interfaces import IStorageTransactionInformation
from zope.interface import implementer
import sys
import logging
import logging as log
import re
from golang.gcompat import qq
from golang import func, defer, strconv, b
# txn_raw_extension returns raw extension from txn metadata
def txn_raw_extension(stor, txn):
......@@ -74,9 +80,9 @@ def txn_raw_extension(stor, txn):
# in a rational way
stor_name = "(%s, %s)" % (type(stor).__name__, stor.getName())
if stor_name not in _already_warned_notxnraw:
logging.warn("%s: storage does not provide IStorageTransactionInformationRaw ...", stor_name)
logging.warn("... will do best-effort to dump pickles in stable order but this cannot be done 100% correctly")
logging.warn("... please upgrade your ZODB & storage: see https://github.com/zopefoundation/ZODB/pull/183 for details.")
log.warning("%s: storage does not provide IStorageTransactionInformationRaw ...", stor_name)
log.warning("... will do best-effort to dump pickles in stable order but this cannot be done 100% correctly")
log.warning("... please upgrade your ZODB & storage: see https://github.com/zopefoundation/ZODB/pull/183 for details.")
_already_warned_notxnraw.add(stor_name)
return serializeext(txn.extension)
......@@ -86,19 +92,12 @@ _already_warned_notxnraw = set()
# zodbdump dumps content of a ZODB storage to a file.
# please see module doc-string for dump format and details
def zodbdump(stor, tidmin, tidmax, hashonly=False, out=sys.stdout):
first = True
def zodbdump(stor, tidmin, tidmax, hashonly=False, out=asbinstream(sys.stdout)):
for txn in stor.iterator(tidmin, tidmax):
vskip = "\n"
if first:
vskip = ""
first = False
# XXX .status not covered by IStorageTransactionInformation
# XXX but covered by BaseStorage.TransactionRecord
out.write("%stxn %s %s\nuser %s\ndescription %s\nextension %s\n" % (
vskip, ashex(txn.tid), qq(txn.status),
out.write(b"txn %s %s\nuser %s\ndescription %s\nextension %s\n" % (
ashex(txn.tid), qq(txn.status),
qq(txn.user),
qq(txn.description),
qq(txn_raw_extension(stor, txn)) ))
......@@ -106,31 +105,33 @@ def zodbdump(stor, tidmin, tidmax, hashonly=False, out=sys.stdout):
objv = txnobjv(txn)
for obj in objv:
entry = "obj %s " % ashex(obj.oid)
entry = b"obj %s " % ashex(obj.oid)
write_data = False
if obj.data is None:
entry += "delete"
entry += b"delete"
# was undo and data taken from obj.data_txn
elif obj.data_txn is not None:
entry += "from %s" % ashex(obj.data_txn)
entry += b"from %s" % ashex(obj.data_txn)
else:
# XXX sha1 is hardcoded for now. Dump format allows other hashes.
entry += "%i sha1:%s" % (len(obj.data), ashex(sha1(obj.data)))
entry += b"%i sha1:%s" % (len(obj.data), ashex(sha1(obj.data)))
write_data = True
out.write(entry)
out.write(b(entry))
if write_data:
if hashonly:
out.write(" -")
out.write(b" -")
else:
out.write("\n")
out.write(b"\n")
out.write(obj.data)
out.write("\n")
out.write(b"\n")
out.write(b"\n")
# ----------------------------------------
# XPickler is Pickler that tries to save objects stably
......@@ -230,10 +231,11 @@ summary = "dump content of a ZODB database"
def usage(out):
print("""\
Usage: zodb dump [OPTIONS] <storage> [tidmin..tidmax]
Usage: zodb dump [OPTIONS] <storage> [<tidrange>]
Dump content of a ZODB database.
<storage> is a URL (see 'zodb help zurl') of a ZODB-storage.
<tidrange> is a history range (see 'zodb help tidrange') to dump.
Options:
......@@ -241,6 +243,7 @@ Options:
-h --help show this help
""", file=out)
@func
def main(argv):
hashonly = False
......@@ -274,5 +277,253 @@ def main(argv):
sys.exit(2)
stor = storageFromURL(storurl, read_only=True)
defer(stor.close)
zodbdump(stor, tidmin, tidmax, hashonly)
# ----------------------------------------
# dump reading/parsing
_txn_re = re.compile(br'^txn (?P<tid>[0-9a-f]{16}) "(?P<status>.)"$')
_obj_re = re.compile(br'^obj (?P<oid>[0-9a-f]{16}) ((?P<delete>delete)|from (?P<from>[0-9a-f]{16})|(?P<size>[0-9]+) (?P<hashfunc>\w+):(?P<hash>[0-9a-f]+)(?P<hashonly> -)?)')
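To illustrate the three forms of `obj` entry that the pattern above accepts (delete, copy from an earlier revision, and data with an optional hash-only marker), here is a small standalone sketch. The regular expression is copied from the line above; `classify` is a hypothetical helper introduced only for this illustration.

```python
import re

# Same pattern as _obj_re above.
obj_re = re.compile(br'^obj (?P<oid>[0-9a-f]{16}) ((?P<delete>delete)|from (?P<from>[0-9a-f]{16})|(?P<size>[0-9]+) (?P<hashfunc>\w+):(?P<hash>[0-9a-f]+)(?P<hashonly> -)?)')

def classify(line):
    """Return a tuple describing the obj entry, or None if it does not match."""
    m = obj_re.match(line)
    if m is None:
        return None
    if m.group('delete'):
        return ('delete', m.group('oid'))
    if m.group('from'):
        return ('copy', m.group('oid'), m.group('from'))
    hashonly = m.group('hashonly') is not None
    return ('data', m.group('oid'), int(m.group('size')),
            m.group('hashfunc'), hashonly)

oid = b'0000000000000001'
fake_hash = b'ab' * 20  # placeholder hex digest, not a real sha1 of anything
print(classify(b'obj ' + oid + b' delete'))
print(classify(b'obj ' + oid + b' from 0285cbac258bf266'))
print(classify(b'obj ' + oid + b' 4 sha1:' + fake_hash + b' -'))
```

The pattern only validates the entry line itself; whether the payload that follows matches the declared size and hash is checked separately by the reader.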
# _ioname returns name of the reader r, if it has one.
# if there is no name - '' is returned.
def _ioname(r):
return getattr(r, 'name', '')
# DumpReader wraps IO reader to read transactions from zodbdump stream.
#
# The reader must provide .readline() and .read() methods.
# The reader must be opened in binary mode.
class DumpReader(object):
# .lineno - line number position in read stream
def __init__(self, r):
self._r = r
self._line = None # last read line
self.lineno = 0
def _readline(self):
l = self._r.readline()
if l == b'':
self._line = None
return None # EOF
l = l.rstrip(b'\n')
self.lineno += 1
self._line = l
return l
# report a problem found around currently-read line
def _badline(self, msg):
raise RuntimeError("%s+%d: invalid line: %s (%s)" % (_ioname(self._r), self.lineno, msg, qq(self._line)))
# readtxn reads one transaction record from input stream and returns
# Transaction instance or None at EOF.
def readtxn(self):
# header
l = self._readline()
if l is None:
return None
m = _txn_re.match(l)
if m is None:
self._badline('no txn start')
tid = fromhex(m.group('tid'))
status = m.group('status')
def get(name):
l = self._readline()
if l is None or not l.startswith(b'%s ' % name):
self._badline('no %s' % name)
return strconv.unquote(l[len(name) + 1:])
user = get(b'user')
description = get(b'description')
extension = get(b'extension')
# objects
objv = []
while 1:
l = self._readline()
if l == b'':
break # empty line - end of transaction
if l is None or not l.startswith(b'obj '):
self._badline('no obj')
m = _obj_re.match(l)
if m is None:
self._badline('invalid obj entry')
obj = None # will be Object*
oid = fromhex(m.group('oid'))
from_ = m.group('from')
if m.group('delete'):
obj = ObjectDelete(oid)
elif from_:
copy_from = fromhex(from_)
obj = ObjectCopy(oid, copy_from)
else:
size = int(m.group('size'))
hashfunc = m.group('hashfunc')
hashok = fromhex(m.group('hash'))
hashonly = m.group('hashonly') is not None
data = None # see vvv
hcls = hashRegistry.get(hashfunc)
if hcls is None:
self._badline('unknown hash function %s' % qq(hashfunc))
if hashonly:
data = HashOnly(size)
else:
# XXX -> io.readfull
n = size+1 # data LF
data = b''
while n > 0:
chunk = self._r.read(n)
if not chunk: break # EOF on truncated input; reported by the LF check below
data += chunk
n -= len(chunk)
self.lineno += data.count(b'\n')
self._line = None
if data[-1:] != b'\n':
raise RuntimeError('%s+%d: no LF after obj data' % (_ioname(self._r), self.lineno))
data = data[:-1]
# verify data integrity
# TODO option to allow reading corrupted data
h = hcls()
h.update(data)
hash_ = h.digest()
if hash_ != hashok:
raise RuntimeError('%s+%d: data corrupt: %s = %s, expected %s' % (
_ioname(self._r), self.lineno, h.name, ashex(hash_), ashex(hashok)))
obj = ObjectData(oid, data, hashfunc, hashok)
objv.append(obj)
return Transaction(tid, status, user, description, extension, objv)
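The payload handling in `readtxn` (read `size` bytes plus one terminating LF, then verify the hash) can be tried in isolation. Below is a minimal sketch under stated assumptions: sha1 is the only hash, the digest is passed as a hex string, and a stream that ends early is reported with `EOFError`. `read_exact` and `read_payload` are hypothetical names, not zodbtools API.

```python
import hashlib
from io import BytesIO

def read_exact(r, n):
    """Read exactly n bytes from r; raise EOFError if the stream ends early."""
    buf = b''
    while len(buf) < n:
        chunk = r.read(n - len(buf))
        if not chunk:
            raise EOFError('unexpected EOF: %d more bytes wanted' % (n - len(buf)))
        buf += chunk
    return buf

def read_payload(r, size, hexdigest):
    raw = read_exact(r, size + 1)  # payload + terminating LF
    if raw[-1:] != b'\n':
        raise ValueError('no LF after obj data')
    data = raw[:-1]
    if hashlib.sha1(data).hexdigest() != hexdigest:
        raise ValueError('data corrupt')
    return data

# The payload may itself contain LF bytes; only the declared size matters.
good = hashlib.sha1(b'a\nb').hexdigest()
payload = read_payload(BytesIO(b'a\nb\nrest'), 3, good)
```

Reading by declared size rather than by line is what lets object data stay raw binary inside an otherwise line-oriented format.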
# Transaction represents one transaction record in zodbdump stream.
@implementer(IStorageTransactionInformation) # TODO -> IStorageTransactionMetaData after switch to ZODB >= 5
class Transaction(object):
# .tid p64 transaction ID
# .status char status of the transaction
# .user bytes transaction author
# .description bytes transaction description
# .extension_bytes bytes transaction extension
# .objv []Object* objects changed by transaction
def __init__(self, tid, status, user, description, extension, objv):
self.tid = tid
self.status = status
self.user = user
self.description = description
self.extension_bytes = extension
self.objv = objv
# ZODB wants to work with extension as {} - try to convert it on the fly.
#
# The conversion can fail for arbitrary .extension_bytes input.
# The conversion should become unnecessary once
#
# https://github.com/zopefoundation/ZODB/pull/183, or
# https://github.com/zopefoundation/ZODB/pull/207
#
# is in ZODB.
@property
def extension(self):
if not self.extension_bytes:
return {}
return loads(self.extension_bytes)
# ZODB < 5 wants ._extension
@property
def _extension(self):
return self.extension
# zdump returns semi text-binary representation of a record in zodbdump format.
def zdump(self): # -> bytes
z = b'txn %s %s\n' % (ashex(self.tid), qq(self.status))
z += b'user %s\n' % qq(self.user)
z += b'description %s\n' % qq(self.description)
z += b'extension %s\n' % qq(self.extension_bytes)
for obj in self.objv:
z += obj.zdump()
z += b'\n'
return z
# Object is base class for object records in zodbdump stream.
class Object(object):
# .oid p64 object ID
def __init__(self, oid):
self.oid = oid
# ObjectDelete represents object deletion.
class ObjectDelete(Object):
def __init__(self, oid):
super(ObjectDelete, self).__init__(oid)
def zdump(self):
return b'obj %s delete\n' % (ashex(self.oid))
# ObjectCopy represents object data copy.
class ObjectCopy(Object):
# .copy_from tid copy object data from object's revision tid
def __init__(self, oid, copy_from):
super(ObjectCopy, self).__init__(oid)
self.copy_from = copy_from
def zdump(self):
return b'obj %s from %s\n' % (ashex(self.oid), ashex(self.copy_from))
# ObjectData represents record with object data.
class ObjectData(Object):
# .data HashOnly | bytes
# .hashfunc str hash function used for integrity
# .hash_ bytes hash of the object's data
def __init__(self, oid, data, hashfunc, hash_):
super(ObjectData, self).__init__(oid)
self.data = data
self.hashfunc = hashfunc
self.hash_ = hash_
def zdump(self):
data = self.data
hashonly = isinstance(data, HashOnly)
if hashonly:
size = data.size
else:
size = len(data)
z = b'obj %s %d %s:%s' % (ashex(self.oid), size, self.hashfunc, ashex(self.hash_))
if hashonly:
z += b' -'
else:
z += b'\n'
z += data
z += b'\n'
return z
# HashOnly indicates that this ObjectData record contains only hash and does not contain object data.
class HashOnly(object):
# .size int
def __init__(self, size):
self.size = size
def __repr__(self):
return 'HashOnly(%d)' % self.size
def __eq__(a, b):
return isinstance(b, HashOnly) and a.size == b.size
# -*- coding: utf-8 -*-
# Copyright (C) 2017 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
# Copyright (C) 2017-2020 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
......@@ -22,13 +22,19 @@
from __future__ import print_function
from zodbtools.util import ashex, storageFromURL
from collections import OrderedDict
from golang import func, defer
import sys
def _last_tid(stor):
print("W: last_tid is deprecated alias for head", file=sys.stderr)
return infoDict["head"](stor)
# {} parameter_name -> get_parameter(stor)
infoDict = OrderedDict([
("name", lambda stor: stor.getName()),
("size", lambda stor: stor.getSize()),
("last_tid", lambda stor: ashex(stor.lastTransaction())),
("head", lambda stor: ashex(stor.lastTransaction())),
("last_tid", _last_tid),
])
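The deprecated-alias arrangement above can be exercised without a real storage. This is a minimal sketch, assuming a stand-in `FakeStorage` class (hypothetical, with a made-up name and transaction ID) and `binascii.hexlify` in place of `ashex`; it only shows that `last_tid` warns on stderr and then returns the same value as `head`.

```python
from __future__ import print_function
from binascii import hexlify
from collections import OrderedDict
import sys

class FakeStorage(object):
    # Hypothetical stand-in for a ZODB storage; values are made up.
    def getName(self):
        return "demo"
    def lastTransaction(self):
        return b'\x03\xd8\x5c\x00\x00\x00\x00\x01'

def _last_tid(stor):
    print("W: last_tid is deprecated alias for head", file=sys.stderr)
    return info["head"](stor)

# parameter_name -> get_parameter(stor)
info = OrderedDict([
    ("name", lambda stor: stor.getName()),
    ("head", lambda stor: hexlify(stor.lastTransaction()).decode("ascii")),
    ("last_tid", _last_tid),
])

stor = FakeStorage()
head = info["head"](stor)
alias = info["last_tid"](stor)  # also prints the warning to stderr
```

Keeping the alias as a regular entry in the dictionary means existing callers keep working while the warning nudges them toward the new name.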
def zodbinfo(stor, parameterv):
......@@ -71,6 +77,7 @@ Options:
-h --help show this help
""", file=out)
@func
def main(argv):
try:
optv, argv = getopt.getopt(argv[1:], "h", ["help"])
......@@ -91,5 +98,6 @@ def main(argv):
sys.exit(2)
stor = storageFromURL(storurl, read_only=True)
defer(stor.close)
zodbinfo(stor, argv[1:])
# Copyright (C) 2021 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
"""Zodbrestore - Restore content of a ZODB database.
Zodbrestore reads transactions from zodbdump output and recreates them in a
ZODB storage. See Zodbdump documentation for details.
"""
from __future__ import print_function
from zodbtools.zodbdump import DumpReader
from zodbtools.zodbcommit import zodbcommit
from zodbtools.util import asbinstream, ashex, storageFromURL
from golang import func, defer
# zodbrestore restores transactions read from reader r in zodbdump format.
#
# restoredf, if not None, is called for every restored transaction.
def zodbrestore(stor, r, restoredf=None):
zr = DumpReader(r)
at = stor.lastTransaction()
while 1:
txn = zr.readtxn()
if txn is None:
break
zodbcommit(stor, at, txn)
if restoredf is not None:
restoredf(txn)
at = txn.tid
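The way `at` advances through the restore loop can be seen with in-memory stand-ins. The sketch below assumes a hypothetical `Txn` class carrying only a `tid`, and a `commit` callable that records its arguments instead of touching a storage; it mirrors the loop structure above but is not the zodbtools implementation.

```python
class Txn(object):
    # Hypothetical minimal transaction: only the ID matters here.
    def __init__(self, tid):
        self.tid = tid

def restore(head, txns, commit, restoredf=None):
    at = head
    for txn in txns:
        commit(at, txn)        # each commit is made on top of the previous one
        if restoredf is not None:
            restoredf(txn)
        at = txn.tid
    return at

calls = []      # (at, tid) pairs seen by commit
restored = []   # transactions reported via the callback
t1 = Txn(b'\x00' * 7 + b'\x01')
t2 = Txn(b'\x00' * 7 + b'\x02')
last = restore(b'\x00' * 8, [t1, t2],
               lambda at, txn: calls.append((at, txn.tid)),
               restored.append)
```

Each transaction is committed against the ID of the one restored just before it, starting from the storage head at the time the restore begins.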
# ----------------------------------------
import sys, getopt
summary = "restore content of a ZODB database"
def usage(out):
print("""\
Usage: zodb restore [OPTIONS] <storage> < input
Restore content of a ZODB database.
The transactions to restore are read from stdin in zodbdump format.
On success the ID of every restored transaction is printed to stdout.
<storage> is a URL (see 'zodb help zurl') of a ZODB-storage.
Options:
-h --help show this help
""", file=out)
@func
def main(argv):
try:
optv, argv = getopt.getopt(argv[1:], "h", ["help"])
except getopt.GetoptError as e:
print(e, file=sys.stderr)
usage(sys.stderr)
sys.exit(2)
for opt, _ in optv:
if opt in ("-h", "--help"):
usage(sys.stdout)
sys.exit(0)
if len(argv) != 1:
usage(sys.stderr)
sys.exit(2)
storurl = argv[0]
stor = storageFromURL(storurl)
defer(stor.close)
def _(txn):
print(ashex(txn.tid))
zodbrestore(stor, asbinstream(sys.stdin), _)